Hepatitis C virus diagnostics and vaccines 


A pharmaceutical composition comprising a 
hepatitis C virus (HCV) antisense polynucleotide which 
is selectively hybridizable to the sense strand of the 
genome of an HCV, wherein the polynucleotide comprises 
a contiguous sequence of at least 8 nucleotides 
complementary to the sense strand of the genome of an 
HCV, and a pharmaceutically acceptable excipient, 


wherein HCV is characterized by: 

(i) a positive stranded RNA genome; 

(ii) said genome comprising an open reading frame 
(ORF) encoding a polyprotein; and 

(iii) said polyprotein is encoded by an HCV genome 
wherein the HCV genome has a homology at the 
polypeptide level of at least 40% to the 169 amino 
acid sequence shown in Figure 11. 
Description 

Technical Field 

[0001] The invention relates to materials and methodologies for managing the spread of non-A, non-B hepatitis virus 
(NANBV) infection. More specifically, it relates to polynucleotides derived from the genome of an etiologic agent of 
NANBH, hepatitis C virus (HCV), to polypeptides encoded therein, and to antibodies directed to the polypeptides. 
These reagents are useful as screening agents for HCV and its infection, and as protective agents against the disease. 
Background Art 

[0004] Non-A, Non-B hepatitis (NANBH) is a transmissible disease or family of diseases that are believed to be viral- 
induced, and that are distinguishable from other forms of viral-associated liver diseases, including that caused by the 
known hepatitis viruses, i.e., hepatitis A virus (HAV), hepatitis B virus (HBV), and delta hepatitis virus (HDV), as well 
as the hepatitis induced by cytomegalovirus (CMV) or Epstein-Barr virus (EBV). NANBH was first identified in transfused 
individuals. Transmission from man to chimpanzee and serial passage in chimpanzees provided evidence that NANBH 
is due to a transmissible infectious agent or agents. 

[0005] Epidemiologic evidence suggests that there may be three types of NANBH: the water-borne epidemic 
type; the blood or needle associated type; and the sporadically occurring (community acquired) type. However, the 
number of agents which may be causative of NANBH is unknown. 

[0006] Clinical diagnosis and identification of NANBH has been accomplished primarily by exclusion of other viral 
markers. Among the methods used to detect putative NANBV antigens and antibodies are agar-gel diffusion, 
counterimmunoelectrophoresis, immunofluorescence microscopy, immune electron microscopy, radioimmunoassay, and 
enzyme-linked immunosorbent assay. However, none of these assays has proved to be sufficiently sensitive, specific, 
and reproducible to be used as a diagnostic test for NANBH. 

[0007] Previously there was neither clarity nor agreement as to the identity or specificity of the antigen-antibody 
systems associated with agents of NANBH. This was due, at least in part, to the prior or co-infection of HBV with 
NANBV in individuals, and to the known complexity of the soluble and particulate antigens associated with HBV, as 
well as to the integration of HBV DNA into the genome of liver cells. In addition, there is the possibility that NANBH is 
caused by more than one infectious agent, as well as the possibility that NANBH has been misdiagnosed. Moreover, 
it is unclear what the serological assays detect in the serum of patients with NANBH. It has been postulated that the 
agar-gel diffusion and counterimmunoelectrophoresis assays detect autoimmune responses or nonspecific protein 
interactions that sometimes occur between serum specimens, and that they do not represent specific NANBV 
antigen-antibody reactions. The immunofluorescence, enzyme-linked immunosorbent, and radioimmunoassays appear to 
detect low levels of a rheumatoid-factor-like material that is frequently present in the serum of patients with NANBH 
as well as in patients with other hepatic and nonhepatic diseases. Some of the reactivity detected may represent 
antibody to host-determined cytoplasmic antigens. 

[0008] A number of candidate NANBVs have been proposed. See, for example, the reviews by Prince (1983), Feinstone 
and Hoofnagle (1984), and Overby (1985, 1986, 1987) and the article by Iwarson (1987). However, there is no proof 
that any of these candidates represents the etiological agent of NANBH. 

[0009] The demand for sensitive, specific methods for screening and identifying carriers of NANBV and 
NANBV-contaminated blood or blood products is significant. Post-transfusion hepatitis (PTH) occurs in approximately 10% of 
transfused patients, and NANBH accounts for up to 90% of these cases. The major problem in this disease is the 
frequent progression to chronic liver damage (25-55%). 

[0010] Patient care as well as the prevention of transmission of NANBH by blood and blood products or by close 
personal contact require reliable screening, diagnostic and prognostic tools to detect nucleic acids, antigens and 
antibodies related to NANBV. In addition, there is also a need for effective vaccines and immunotherapeutic 
agents for the prevention and/or treatment of the disease. 

[0011] Applicant discovered a new virus, the Hepatitis C virus (HCV), which has proven to be the major etiologic 
agent of blood-borne NANBH (BB-NANBH). Applicant's initial work, including a partial genomic sequence of the 
prototype HCV isolate, CDC/HCV1 (also called HCV1), is described in EPO Pub. No. 318,216 (published 31 May 1989) 
and PCT Pub. No. WO 89/04669 (published 1 June 1989). The disclosures of these patent applications, as well as 
any corresponding national patent applications, are incorporated herein by reference. These applications teach, inter 
alia, recombinant DNA methods of cloning and expressing HCV sequences, HCV polypeptides, HCV immunodiagnostic 
techniques, HCV probe diagnostic techniques, anti-HCV antibodies, and methods of isolating new HCV sequences, 
including sequences of new HCV isolates. 
Disclosure of the Invention 

[0012] The present invention is based, in part, on new HCV sequences and polypeptides that are not disclosed in 
EPO Pub. No. 318,216, or in PCT Pub. No. WO 89/04669. Included within the invention is the application of these new 
sequences and polypeptides in, inter alia, immunodiagnostics, probe diagnostics, anti-HCV antibody production, PCR 
technology and recombinant DNA technology. Included within the invention, also, are new immunoassays based upon 
the immunogenicity of HCV polypeptides disclosed herein. The new subject matter claimed herein, while developed 
using techniques described in, for example, EPO Pub. No. 318,216, has a priority date which antecedes that publication, 
or any counterpart thereof. Thus, the invention provides novel compositions and methods useful for screening samples 
for HCV antigens and antibodies, and useful for treatment of HCV infections. 

[0013] Accordingly, one aspect of the invention is a recombinant polynucleotide comprising a sequence derived from 
HCV cDNA, wherein the HCV cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 
167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, 
or wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17. 

[0014] Another aspect of the invention is a purified polypeptide comprising an epitope encoded within HCV cDNA 
wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17. 
[0015] Yet another aspect of the invention is an immunogenic polypeptide produced by a cell transformed with a 
recombinant expression vector comprising an ORF of DNA derived from HCV cDNA, wherein the HCV cDNA is 
comprised of a sequence derived from the HCV cDNA sequence in clone CA279a, or clone CA74a, or clone 13i, or clone 
CA290a, or clone 33C or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, 
or clone 39c, or clone 15e, and wherein the ORF is operably linked to a control sequence compatible with a desired host. 
[0016] Another aspect of the invention is a peptide comprising an HCV epitope, wherein the peptide is of the formula 

AAx-AAy, 

wherein x and y designate amino acid numbers shown in Fig. 17, and wherein the peptide is selected from the group 
consisting of AA1-AA25, AA1-AA50, AA1-AA84, AA9-AA177, AA1-AA10, AA5-AA20, AA20-AA25, AA35-AA45, 
AA50-AA100, AA40-AA90, AA45-AA65, AA65-AA75, AA80-AA90, AA99-AA120, AA95-AA110, AA105-AA120, 
AA100-AA150, AA150-AA200, AA155-AA170, AA190-AA210, AA200-AA250, AA220-AA240, AA245-AA265, 
AA250-AA300, AA290-AA330, AA290-AA305, AA300-AA350, AA310-AA330, AA350-AA400, AA380-AA395, 
AA405-AA495, AA400-AA450, AA405-AA415, AA415-AA425, AA425-AA435, AA437-AA582, AA450-AA500, 
AA440-AA460, AA460-AA470, AA475-AA495, AA500-AA550, AA511-AA690, AA515-AA550, AA550-AA600, 
AA550-AA625, AA575-AA605, AA585-AA600, AA600-AA650, AA600-AA625, AA635-AA665, AA650-AA700, 
AA645-AA680, AA700-AA750, AA700-AA725, AA700-AA750, AA725-AA775, AA770-AA790, AA750-AA800, 
AA800-AA815, AA825-AA850, AA850-AA875, AA800-AA850, AA920-AA990, AA850-AA900, AA920-AA945, 
AA940-AA965, AA970-AA990, AA950-AA1000, AA1000-AA1060, AA1000-AA1025, AA1000-AA1050, 
AA1025-AA1040, AA1040-AA1055, AA1075-AA1175, AA1050-AA1200, AA1070-AA1100, AA1100-AA1130, 
AA1140-AA1165, AA1192-AA1457, AA1195-AA1250, AA1200-AA1225, AA1225-AA1250, AA1250-AA1300, 
AA1260-AA1310, AA1260-AA1280, AA1266-AA1428, AA1300-AA1350, AA1290-AA1310, AA1310-AA1340, 
AA1345-AA1405, AA1345-AA1365, AA1350-AA1400, AA1365-AA1380, AA1380-AA1405, AA1400-AA1450, 
AA1450-AA1500, AA1460-AA1475, AA1475-AA1515, AA1475-AA1500, AA1500-AA1550, AA1500-AA1515, 
AA1515-AA1550, AA1550-AA1600, AA1545-AA1560, AA1569-AA1931, AA1570-AA1590, AA1595-AA1610, 
AA1590-AA1650, AA1610-AA1645, AA1650-AA1690, AA1685-AA1770, AA1689-AA1805, AA1690-AA1720, 
AA1694-AA1735, AA1720-AA1745, AA1745-AA1770, AA1750-AA1800, AA1775-AA1810, AA1795-AA1850, 
AA1850-AA1900, AA1900-AA1950, AA1900-AA1920, AA1916-AA2021, AA1920-AA1940, AA1949-AA2124, 
AA1950-AA2000, AA1950-AA1985, AA1980-AA2000, AA2000-AA2050, AA2005-AA2025, AA2020-AA2045, 
AA2045-AA2100, AA2045-AA2070, AA2054-AA2223, AA2070-AA2100, AA2100-AA2150, AA2150-AA2200, 
AA2200-AA2250, AA2200-AA2325, AA2250-AA2330, AA2255-AA2270, AA2265-AA2280, AA2280-AA2290, 
AA2287-AA2385, AA2300-AA2350, AA2290-AA2310, AA2310-AA2330, AA2330-AA2350, AA2350-AA2400, 
AA2348-AA2464, AA2345-AA2415, AA2345-AA2375, AA2370-AA2410, AA2371-AA2502, AA2400-AA2450, 
AA2400-AA2425, AA2415-AA2450, AA2445-AA2500, AA2445-AA2475, AA2470-AA2490, AA2500-AA2550, 
AA2505-AA2540, AA2535-AA2560, AA2550-AA2600, AA2560-AA2580, AA2600-AA2650, AA2605-AA2620, 
AA2620-AA2650, AA2640-AA2660, AA2650-AA2700, AA2655-AA2670, AA2670-AA2700, AA2700-AA2750, 
AA2740-AA2760, AA2750-AA2800, AA2755-AA2780, AA2780-AA2830, AA2785-AA2810, 
AA2796-AA2886, AA2810-AA2825, AA2800-AA2850, AA2850-AA2900, 
AA2850-AA2865, AA2885-AA2905, AA2900-AA2950, AA2910-AA2930, AA2925-AA2950, AA2945-end (C terminal). 
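
By way of illustration only, the AAx-AAy notation used above can be read as a 1-based, inclusive range of amino acid positions in the polyprotein of Fig. 17. The following Python sketch shows one way such a label might be resolved to a peptide. The function names and the toy sequence are hypothetical and are not part of the disclosure. The sketch does not handle the open-ended label "AA2945-end".

```python
import re
from typing import Tuple

# Hypothetical helper (not from the specification): matches labels such as "AA5-AA20".
_RANGE = re.compile(r"^AA(\d+)-AA(\d+)$")


def parse_peptide_range(label: str) -> Tuple[int, int]:
    """Parse 'AAx-AAy' into 1-based, inclusive positions (x, y)."""
    match = _RANGE.match(label.strip())
    if match is None:
        raise ValueError(f"not an AAx-AAy label: {label!r}")
    x, y = int(match.group(1)), int(match.group(2))
    if x < 1 or y < x:
        raise ValueError(f"invalid range: {label!r}")
    return x, y


def extract_peptide(polyprotein: str, label: str) -> str:
    """Return residues x..y (1-based, inclusive) of the given polyprotein."""
    x, y = parse_peptide_range(label)
    if y > len(polyprotein):
        raise ValueError(f"{label} extends past the end of the sequence")
    return polyprotein[x - 1 : y]


# Artificial placeholder sequence for illustration; it is NOT the Fig. 17 polyprotein.
toy_polyprotein = "ACDEFGHIKLMNPQRSTVWY" * 2
print(extract_peptide(toy_polyprotein, "AA5-AA20"))
```

In practice the polyprotein string would be the amino acid sequence of Fig. 17, which is not reproduced here.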
[0017] Still another aspect of the invention is a monoclonal antibody directed against an epitope encoded in HCV 
cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 
17, or is the sequence present in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, 
or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh. 

[0018] Yet another aspect of the invention is a preparation of purified polyclonal antibodies directed against a 
polypeptide comprised of an epitope encoded within HCV cDNA, wherein the HCV cDNA is of a sequence indicated by 
nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or clone 
59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, 
or clone 205a, or clone 18g, or clone 16jh. 

[0019] Still another aspect of the invention is a polynucleotide probe for HCV, wherein the probe is comprised of an 
HCV sequence derived from an HCV cDNA sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 
in Fig. 17, or from the complement of the HCV cDNA sequence. 

[0020] Yet another aspect of the invention is a kit for analyzing samples for the presence of polynucleotides from 
HCV comprising a polynucleotide probe containing a nucleotide sequence of about 8 or more nucleotides, wherein the 
nucleotide sequence is derived from HCV cDNA which is of a sequence indicated by nucleotide numbers -319 to 1348 
or 8659 to 8866 in Fig. 17, wherein the polynucleotide probe is in a suitable container. 

[0021] Another aspect of the invention is a kit for analyzing samples for the presence of an HCV antigen comprising 
an antibody which reacts immunologically with an HCV antigen, wherein the antigen contains an epitope encoded 
within HCV cDNA which is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or 
wherein the HCV cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or 
clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh. 
[0022] Yet another aspect of the invention is a kit for analyzing samples for the presence of an HCV antibody 
comprising an antigenic polypeptide containing an HCV epitope encoded within HCV cDNA which is of a sequence indicated 
by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is in clone 13i, or clone 26j, or clone 59a, or clone 
84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 
205a, or clone 18g, or clone 16jh. 

[0023] Another aspect of the invention is a kit for analyzing samples for the presence of an HCV antibody comprising 
an antigenic polypeptide expressed from HCV cDNA in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, 
or clone 33C or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or clone 
39c, or clone 15e, wherein the antigenic polypeptide is present in a suitable container. 

[0024] Still another aspect of the invention is a method for detecting HCV nucleic acids in a sample comprising: 

(a) reacting nucleic acids of the sample with a polynucleotide probe for HCV, wherein the probe is comprised of 
an HCV sequence derived from an HCV cDNA sequence indicated by nucleotide numbers -319 to 
1348 or 8659 to 8866 in Fig. 17, and wherein the reacting is under conditions which allow the formation of a 
polynucleotide duplex between the probe and the HCV nucleic acid from the sample; and 
(b) detecting a polynucleotide duplex which contains the probe, formed in step (a). 

[0025] Yet another aspect of the invention is an immunoassay for detecting an HCV antigen comprising: 

(a) incubating a sample suspected of containing an HCV antigen with an antibody directed against an HCV 
epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 
1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or clone 59a, or clone 84a, or 
clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or 
clone 18g, or clone 16jh, and wherein the incubating is under conditions which allow formation of an antigen-antibody 
complex; and 
(b) detecting an antibody-antigen complex formed in step (a) which contains the antibody. 
[0026] Still another aspect of the invention is an immunoassay for detecting antibodies directed against an HCV 
antigen comprising: 

(a) incubating a sample suspected of containing anti-HCV antibodies with an antigen polypeptide containing an 
epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 
1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or clone 59a, or clone 84a, or 
clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or 
clone 18g, or clone 16jh, and wherein the incubating is under conditions which allow formation of an antigen-antibody 
complex; and 
(b) detecting an antibody-antigen complex formed in step (a) which contains the antigen polypeptide. 
[0027] Another aspect of the invention is a vaccine for treatment of HCV infection comprising an immunogenic 
polypeptide containing an HCV epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by 
nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or 
clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone 
ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the immunogenic polypeptide is present in a 
pharmacologically effective dose in a pharmaceutically acceptable excipient. 

[0028] Yet another aspect of the invention is a method for producing antibodies to HCV comprising administering to 
an individual an isolated immunogenic polypeptide containing an HCV epitope encoded in HCV cDNA, wherein the HCV 
cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is of the sequence 
present in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or clone 33C or clone 40b, or clone 33b, or 
clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or clone 39c, or clone 15e, and wherein the immunogenic 
polypeptide is present in a pharmacologically effective dose in a pharmaceutically acceptable excipient. 

[0029] Still another aspect of the invention is an antisense polynucleotide derived from HCV cDNA, wherein the HCV 
cDNA is that shown in Fig. 17. 

[0030] Yet another aspect of the invention is a method for preparing purified fusion polypeptide C100-3 comprising: 

(a) providing a crude cell lysate containing polypeptide C100-3, 

(b) treating the crude cell lysate with an amount of acetone which causes the polypeptide to precipitate, 

(c) isolating and solubilizing the precipitated material, 

(d) isolating the C100-3 polypeptide by anion exchange chromatography, and 

(e) further isolating the C100-3 polypeptide of step (d) by gel filtration. 
Brief Description of the Drawings 


[0031] Fig. 1 shows the sequence of the HCV cDNA in clone 12f, and the amino acids encoded therein. 
[0032] Fig. 2 shows the HCV cDNA sequence in clone k9-1, and the amino acids encoded therein. 
[0033] Fig. 3 shows the sequence of clone 15e, and the amino acids encoded therein. 

[0034] Fig. 4 shows the nucleotide sequence of HCV cDNA in clone 13i, the amino acids encoded therein, and the 
sequences which overlap with clone 12f. 

[0035] Fig. 5 shows the nucleotide sequence of HCV cDNA in clone 26j, the amino acids encoded therein, and the 
sequences which overlap clone 13i. 
[0036] Fig. 6 shows the nucleotide sequence of HCV cDNA in clone CA59a, the amino acids encoded therein, and 
the sequences which overlap with clones 26j and K9-1. 

[0037] Fig. 7 shows the nucleotide sequence of HCV cDNA in clone CA84a, the amino acids encoded therein, and 
the sequences which overlap with clone CA59a. 

[0038] Fig. 8 shows the nucleotide sequence of HCV cDNA in clone CA156e, the amino acids encoded therein, and 
the sequences which overlap with CA84a. 

[0039] Fig. 9 shows the nucleotide sequence of HCV cDNA in clone CA167b, the amino acids encoded therein, and 
the sequences which overlap CA156e. 

[0040] Fig. 10 shows the nucleotide sequence of HCV cDNA in clone CA216a, the amino acids encoded therein, 
and the overlap with clone CA167b. 
[0041] Fig. 11 shows the nucleotide sequence of HCV cDNA in clone CA290a, the amino acids encoded therein, 
and the overlap with clone CA216a. 

[0042] Fig. 12 shows the nucleotide sequence of HCV cDNA in clone ag30a and the overlap with clone CA290a. 
[0043] Fig. 13 shows the nucleotide sequence of HCV cDNA in clone CA205a, and the overlap with the HCV cDNA 
sequence in clone CA290a. 

[0044] Fig. 14 shows the nucleotide sequence of HCV cDNA in clone 18g, and the overlap with the HCV cDNA 
sequence in clone ag30a. 

[0045] Fig. 15 shows the nucleotide sequence of HCV cDNA in clone 16jh, the amino acids encoded therein, and 
the overlap of nucleotides with the HCV cDNA sequence in clone 15e. 

[0046] Fig. 16 shows the ORF of HCV cDNA derived from clones pi14a, CA167b, CA156e, CA84a, CA59a, K9-1, 
12f, 14i, 11b, 7f, 7e, 8h, 33c, 40b, 37b, 35, 36, 81, 32, 33b, 25c, 14c, 8f, 33f, 33g, 39c, 35f, 19g, 26g, and 15e. 

[0047] Fig. 17 shows the sense strand of the compiled HCV cDNA sequence derived from the above-described 
clones and the compiled HCV cDNA sequence published in EPO Pub. No. 318,216. The clones from which the 
sequence was derived are b114a, 18g, ag30a, CA205a, CA290a, CA216a, pi14a, CA167b, CA156e, CA84a, CA59a, 
K9-1 (also called k9-1), 26j, 13i, 12f, 14i, 11b, 7f, 7e, 8h, 33c, 40b, 37b, 35, 36, 81, 32, 33b, 25c, 14c, 8f, 33f, 33g, 39c, 
35f, 19g, 26g, 15e, b5a, and 16jh. In the figure the three horizontal dashes above the sequence indicate the position 
of the putative initiator methionine codon; the two vertical dashes indicate the first and last nucleotides of the published 
sequence. Also shown in the figure is the amino acid sequence of the putative polyprotein encoded in the HCV cDNA. 
[0048] Fig. 18 is a diagram of the immunological colony screening method used in antigenic mapping studies. 
[0049] Fig. 19 shows the hydrophobicity profiles of polyproteins encoded in HCV and in West Nile virus. 

[0050] Fig. 20 is a tracing of the hydrophilicity/hydrophobicity profile and of the antigenic index of the putative HCV 
polyprotein. 

[0051] Fig. 21 shows the conserved co-linear peptides in HCV and Flaviviruses. 
Modes for Carrying Out the Invention 
I. Definitions 

[0052] The term "hepatitis C virus" has been reserved by workers in the field for a heretofore unknown etiologic 
agent of NANBH. Accordingly, as used herein, "hepatitis C virus" (HCV) refers to an agent causative of NANBH, which 
was formerly referred to as NANBV and/or BB-NANBV. The terms HCV, NANBV, and BB-NANBV are used 
interchangeably herein. As an extension of this terminology, the disease caused by HCV, formerly called NANB hepatitis (NANBH), 
is called hepatitis C. The terms NANBH and hepatitis C may be used interchangeably herein. 

[0053] The term "HCV", as used herein, denotes a viral species of which pathogenic strains cause NANBH, and 
attenuated strains or defective interfering particles derived therefrom. As shown infra, the HCV genome is comprised 
of RNA. It is known that RNA-containing viruses have relatively high rates of spontaneous mutation, i.e., reportedly on 
the order of 10^-3 to 10^-4 per incorporated nucleotide (Fields & Knipe (1986)). Therefore, there are multiple strains, 
which may be virulent or avirulent, within the HCV species described infra. The compositions and methods described 
herein enable the propagation, identification, detection, and isolation of the various HCV strains or isolates. Moreover, 
the disclosure herein allows the preparation of diagnostics and vaccines for the various strains, as well as compositions 
and methods that have utility in screening procedures for anti-viral agents for pharmacologic use, such as agents that 
inhibit replication of HCV. 
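
To give a sense of scale for the reported mutation rate, the following Python sketch multiplies a per-nucleotide rate by a sequence length to obtain the expected number of changes per copy. It is an illustrative calculation only. The lengths used are the ORF size range of approximately 9,000 to 12,000 nucleotides given elsewhere in these definitions, taken here as an assumed stand-in for genome length, and the function name is hypothetical.

```python
def expected_mutations(rate_per_nucleotide: float, length_nt: int) -> float:
    """Expected number of mutations introduced per copy of a sequence.

    Assumes each incorporated nucleotide mutates independently at the same rate.
    """
    return rate_per_nucleotide * length_nt


# Rates on the order reported in the text; lengths are assumed stand-ins.
for rate in (1e-3, 1e-4):
    for length in (9_000, 12_000):
        count = expected_mutations(rate, length)
        print(f"rate={rate:g}, length={length} nt -> about {count:.1f} changes per copy")
```

Under these assumptions each copy would carry on the order of one to ten changes, which is consistent with the expectation of multiple strains within the species.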

[0054] The information provided herein, although derived from the prototype strain or isolate of HCV, hereinafter 
referred to as CDC/HCV1 (also called HCV1), is sufficient to allow a viral taxonomist to identify other strains which fall 
within the species. The information provided herein allows the belief that HCV is a Flavi-like virus. The morphology 
and composition of Flavivirus particles are known, and are discussed in Brinton (1986). Generally, with respect to 
morphology, Flaviviruses contain a central nucleocapsid surrounded by a lipid bilayer. Virions are spherical and have 
a diameter of about 40-50 nm. Their cores are about 25-30 nm in diameter. Along the outer surface of the virion envelope 
are projections that are about 5-10 nm long with terminal knobs about 2 nm in diameter. 

[0055] Different strains or isolates of HCV are expected to contain variations at the amino acid and nucleic acid levels 
compared with the prototype isolate, HCV1. Many isolates are expected to show much (i.e., more than about 40%) 
homology in the total amino acid sequence compared with HCV1. However, it may also be found that there are other, less 
homologous HCV isolates. These would be defined as HCV strains according to various criteria such as an ORF of 
approximately 9,000 nucleotides to approximately 12,000 nucleotides, encoding a polyprotein similar in size to that of 
HCV1, an encoded polyprotein of similar hydrophobic and antigenic character to that of HCV1, and the presence of 
co-linear peptide sequences that are conserved with HCV1. In addition, the genome would be a positive-stranded RNA. 
[0056] HCV encodes at least one epitope which is immunologically identifiable with an epitope in the HCV genome 
from which the cDNAs described herein are derived; preferably the epitope is contained in an amino acid sequence 
described herein. The epitope is unique to HCV when compared to other known Flaviviruses. The uniqueness of the 
epitope may be determined by its immunological reactivity with anti-HCV antibodies and lack of immunological reactivity 
with antibodies to other Flavivirus species. Methods for determining immunological reactivity are known in the art, for 
example, radioimmunoassay, ELISA, and hemagglutination, and several examples of suitable techniques for 
assays are provided herein. 

[0057] In addition to the above, the following parameters of nucleic acid homology and amino acid homology are 
applicable, either alone or in combination, in identifying a strain or isolate as HCV. Since HCV strains and isolates are 
evolutionarily related, it is expected that the overall homology of the genomes at the nucleotide level probably will be 
about 40% or greater, probably about 60% or greater, and even more probably about 80% or greater; and in addition 
that there will be corresponding contiguous sequences of at least about 13 nucleotides. The correspondence between 
the putative HCV strain genomic sequence and the CDC/HCV1 cDNA sequence can be determined by techniques 
known in the art. For example, it can be determined by a direct comparison of the sequence information of the 
polynucleotide from the putative HCV, and the HCV cDNA sequence(s) described herein. It can also 
be determined by hybridization of the polynucleotides under conditions which form stable duplexes between 
homologous regions (for example, those which would be used prior to S1 digestion), followed by digestion with single-stranded 
specific nuclease(s), followed by size determination of the digested fragments. 
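
The direct sequence comparison mentioned above can be sketched in Python as follows. This is a minimal illustration, not the method of the disclosure: it assumes the two sequences have already been aligned to equal length without gaps, it treats "homology" as simple positional identity, and all function names and default thresholds (40% identity, a shared run of 13 nucleotides) are chosen here only to mirror the figures in the text.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions in two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def longest_common_run(a: str, b: str) -> int:
    """Length of the longest contiguous stretch present in both sequences."""
    a, b = a.upper(), b.upper()
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the run ending at the previous position in both sequences.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def meets_nucleotide_criteria(candidate: str, reference: str,
                              min_identity: float = 40.0, min_run: int = 13) -> bool:
    """True if both the overall identity and the contiguous-run criteria are met."""
    return (percent_identity(candidate, reference) >= min_identity
            and longest_common_run(candidate, reference) >= min_run)


# Toy sequences for illustration only; they are not HCV sequences.
print(meets_nucleotide_criteria("A" * 13 + "CCCC", "A" * 13 + "GGGG"))
```

A real comparison would first align the sequences with an alignment tool and would need a policy for gaps, neither of which is addressed in this sketch.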

[0058] Because of the evolutionary relationship of the strains or isolates of HCV, putative HCV strains or isolates are 
identifiable by their homology at the polypeptide level. Generally, HCV strains or isolates are expected to be more than 
about 40% homologous, probably more than about 70% homologous, and even more probably more than about 80% 
homologous, and some may even be more than about 90% homologous at the polypeptide level. The techniques for 
determining amino acid sequence homology are known in the art. For example, the amino acid sequence may be 
determined directly and compared to the sequences provided herein. Alternatively the nucleotide sequence of the 
genomic material of the putative HCV may be determined (usually via a cDNA intermediate), the amino acid sequence 
encoded therein can be determined, and the corresponding regions compared. 
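
The alternative route described above (determine the nucleotide sequence, deduce the encoded amino acids, then compare) can be illustrated with a short Python sketch. It assumes the standard genetic code, an in-frame cDNA with unambiguous bases, and peptides of equal length. The function names and example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop codon).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(cdna: str) -> str:
    """Translate an in-frame cDNA (or RNA) sequence; trailing partial codons are ignored."""
    seq = cdna.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def peptide_identity(p: str, q: str) -> float:
    """Percentage of identical residues between two equal-length peptides."""
    if not p or len(p) != len(q):
        raise ValueError("peptides must be non-empty and of equal length")
    return 100.0 * sum(1 for x, y in zip(p, q) if x == y) / len(p)


# Toy sequences for illustration only; they differ by one synonymous change (GCC vs GCG).
reference_cdna = "ATGGCCAAA"
candidate_cdna = "ATGGCGAAA"
print(translate(reference_cdna), translate(candidate_cdna))
print(peptide_identity(translate(reference_cdna), translate(candidate_cdna)))
```

The toy pair shows why homology at the polypeptide level can exceed homology at the nucleotide level: a synonymous nucleotide change leaves the encoded residue unchanged.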
[0059] As used herein, a polynucleotide "derived from" a designated sequence refers to a polynucleotide sequence 
which is comprised of a sequence of at least about 6 nucleotides, preferably at least about 8 nucleotides, 
more preferably at least about 10-12 nucleotides, and even more preferably at least about 15-20 nucleotides 
corresponding to a region of the designated nucleotide sequence. "Corresponding" means homologous to or complementary 
to the designated sequence. Preferably, the sequence of the region from which the polynucleotide is derived is 
homologous to or complementary to a sequence which is unique to an HCV genome. Whether or not a sequence is unique 
to the HCV genome can be determined by techniques known to those of skill in the art. For example, the sequence 
can be compared to sequences in databanks, e.g., GenBank, to determine whether it is present in the uninfected host 
or other organisms. The sequence can also be compared to the known sequences of other viral agents, including those 
which are known to induce hepatitis, e.g., HAV, HBV, and HDV, and to other members of the Flaviviridae. The 
correspondence or non-correspondence of the derived sequence to other sequences can also be determined by 
hybridization under the appropriate stringency conditions. Hybridization techniques for determining the complementarity of 
nucleic acid sequences are known in the art, and are discussed infra. See also, for example, Maniatis et al. (1982). In 
addition, mismatches of duplex polynucleotides formed by hybridization can be determined by known techniques, 
including, for example, digestion with a nuclease such as S1 that specifically digests single-stranded areas in duplex 
polynucleotides. Regions from which typical DNA sequences may be "derived" include, but are not limited to, for 
example, regions encoding specific epitopes, as well as non-transcribed and/or non-translated regions. 
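
As a simplified illustration of the "corresponding" test described above, the following Python sketch checks whether a candidate polynucleotide contains a stretch of at least a given length that is either identical to, or the reverse complement of, part of a designated DNA sequence. Exact matching is used here as a stand-in for "homologous to or complementary to", which is an assumption of this sketch; the function names and the default length of 8 are illustrative only.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def shares_region(candidate: str, designated: str, min_len: int = 8) -> bool:
    """True if any min_len window of the candidate occurs in the designated
    sequence or in its reverse complement."""
    candidate = candidate.upper()
    designated = designated.upper()
    targets = (designated, reverse_complement(designated))
    for start in range(len(candidate) - min_len + 1):
        window = candidate[start:start + min_len]
        if any(window in target for target in targets):
            return True
    return False


# Toy sequences for illustration only; they are not HCV sequences.
designated = "ATGCGTACGTTAGC"
antisense_oligo = reverse_complement("GCGTACGTTA")
print(shares_region(antisense_oligo, designated))
```

Uniqueness to the HCV genome, as opposed to mere correspondence, would additionally require a comparison against host and other viral sequences, which this sketch does not attempt.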
[0060] The derived polynucleotide is not necessarily physically derived from the nucleotide sequence shown, but 
may be generated in any manner, including, for example, chemical synthesis or DNA replication or reverse transcription 
or transcription. In addition, combinations of regions corresponding to that of the designated sequence may be modified 
in ways known in the art to be consistent with an intended use. 

[0061] Similarly, a polypeptide or amino acid sequence "derived from" a designated nucleic acid sequence refers to 
a polypeptide having an amino acid sequence identical to that of a polypeptide encoded in the sequence, or a portion 
thereof wherein the portion consists of at least 3-5 amino acids, and more preferably at least 8-10 amino acids, and 
even more preferably at least 11-15 amino acids, or which is immunologically identifiable with a polypeptide encoded 
in the sequence. 

[0062] A recombinant or derived polypeptide is not necessarily translated from a designated nucleic acid sequence, 
for example, the HCV cDNA sequences described herein, or from an HCV genome; it may be generated in any manner, 
including for example, chemical synthesis, or expression of a recombinant expression system, or isolation from mutated 
HCV. A recombinant or derived polypeptide may include one or more analogs of amino acids or unnatural amino acids
in its sequence. Methods of inserting analogs of amino acids into a sequence are known in the art. It also may include 
one or more labels, which are known to those of skill in the art. 

[0063] The term "recombinant polynucleotide" as used herein intends a polynucleotide of genomic, cDNA, semisynthetic,
or synthetic origin which, by virtue of its origin or manipulation: (1) is not associated with all or a portion
of a polynucleotide with which it is associated in nature, (2) is linked to a polynucleotide other than that to which it is
linked in nature, or (3) does not occur in nature.

[0064] The term "polynucleotide" as used herein refers to a polymeric form of nucleotides of any length, either ribonucleotides
or deoxyribonucleotides. This term refers only to the primary structure of the molecule. Thus, this term
includes double- and single-stranded DNA, as well as double- and single-stranded RNA. It also includes known types
of modifications, for example, labels which are known in the art, methylation, "caps", substitution of one or more of the
naturally occurring nucleotides with an analog, internucleotide modifications such as, for example, those with uncharged
linkages (e.g., methyl phosphonates, phosphotriesters, phosphoamidates, carbamates, etc.) and with charged
linkages (e.g., phosphorothioates, phosphorodithioates, etc.), those containing pendant moieties, such as, for example,
proteins (including, e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), those with intercalators
(e.g., acridine, psoralen, etc.), those containing chelators (e.g., metals, radioactive metals, boron, oxidative metals,
etc.), those containing alkylators, those with modified linkages (e.g., alpha anomeric nucleic acids, etc.), as well as
unmodified forms of the polynucleotide.

[0065] The term "purified viral polynucleotide" refers to an HCV genome or fragment thereof which is essentially free,
i.e., contains less than about 50%, preferably less than about 70%, and even more preferably less than about 90% of
polypeptides with which the viral polynucleotide is naturally associated. Techniques for purifying viral polynucleotides
from viral particles are known in the art, and include for example, disruption of the particle with a chaotropic agent, 
differential extraction and separation of the polynucleotide(s) and polypeptides by ion-exchange chromatography, af- 
finity chromatography, and sedimentation according to density. 

[0066] The term "purified viral polypeptide" refers to an HCV polypeptide or fragment thereof which is essentially
free, i.e., contains less than about 50%, preferably less than about 70%, and even more preferably less than about
90%, of cellular components with which the viral polypeptide is naturally associated. Techniques for purifying viral
polypeptides are known in the art, and examples of these techniques are discussed infra. The term "purified viral
polynucleotide" refers to an HCV genome or fragment thereof which is essentially free, i.e., contains less than about
20%, preferably less than about 50%, and even more preferably less than about 70% of polypeptides with which the
viral polynucleotide is naturally associated. Techniques for purifying viral polynucleotides from viral particles are known
in the art, and include for example, disruption of the particle with a chaotropic agent, and separation of the polynucleotide(s)
and polypeptides by ion-exchange chromatography, affinity chromatography, and sedimentation according to density.

[0067] "Recombinant host cells", "host cells", "cells", "cell lines", "cell cultures", and other such terms denoting mi- 
croorganisms or higher eukaryotic cell lines cultured as unicellular entities refer to cells which can be, or have been, 
used as recipients for recombinant vector or other transfer DNA, and include the progeny of the original cell which has 
been transfected. It is understood that the progeny of a single parental cell may not necessarily be completely identical 
in morphology or in genomic or total DNA complement to the original parent, due to natural, accidental, or deliberate
mutation. 

[0068] A "replicon" is any genetic element, e.g., a plasmid, a chromosome, a virus, a cosmid, etc. that behaves as 
an autonomous unit of polynucleotide replication within a cell; i.e., capable of replication under its own control. 
[0069] A "vector" is a replicon in which another polynucleotide segment is attached, so as to bring about the replication
and/or expression of the attached segment. 

[0070] "Control sequence" refers to polynucleotide sequences which are necessary to effect the expression of coding 
sequences to which they are ligated. The nature of such control sequences differs depending upon the host organism; 
in prokaryotes, such control sequences generally include promoter, ribosomal binding site, and terminators; in eukary- 
otes, generally, such control sequences include promoters, terminators and, in some instances, enhancers. The term 
"control sequences" is intended to include, at a minimum, all components whose presence is necessary for expression, 
and may also include additional components whose presence is advantageous, for example, leader sequences. 
[0071] "Operably linked" refers to a juxtaposition wherein the components so described are in a relationship permitting
them to function in their intended manner. A control sequence "operably linked" to a coding sequence is ligated in
such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences. 
[0072] An "open reading frame" (ORF) is a region of a polynucleotide sequence which encodes a polypeptide; this 
region may represent a portion of a coding sequence or a total coding sequence. 

[0073] A "coding sequence" is a polynucleotide sequence which is transcribed into mRNA and/or translated into a 
polypeptide when placed under the control of appropriate regulatory sequences. The boundaries of the coding sequence
are determined by a translation start codon at the 5'-terminus and a translation stop codon at the 3'-terminus.
A coding sequence can include, but is not limited to mRNA, cDNA, and recombinant polynucleotide sequences. 
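
By way of illustration only, the boundary rule stated above (a translation start codon at the 5'-terminus and a stop codon at the 3'-terminus) can be sketched as a scan of one strand in each of its three reading frames. The function name find_orfs, the min_codons parameter, and the use of ATG with the standard stop codons TAA, TAG and TGA on a DNA-alphabet sequence are assumptions made for this sketch; it is not a procedure taken from the specification.

```python
# Illustrative sketch only; names and parameters are hypothetical.
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=1):
    """Scan one strand in all three frames for start...stop stretches.

    Returns a list of (start, end, frame) tuples, where start is the
    0-based index of the ATG, end is the index just past the stop codon,
    and frame is 0, 1 or 2.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # first start codon opens a candidate region
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # codons before the stop
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next start codon in this frame
    return orfs
```

For example, find_orfs("CCATGAAATTTTAGGG") reports a single region beginning at the ATG in the third frame and ending just after the TAG. A region lacking either boundary codon, which the definition of an ORF above also permits, is not reported by this sketch.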
[0074] "Immunologically identifiable with/as" refers to the presence of epitope(s) and polypeptide(s) which are also
present in the designated polypeptide(s), usually HCV proteins. Immunological identity may be determined by antibody
binding and/or competition in binding; these techniques are known to those of average skill in the art, and are also
illustrated infra.

[0075] As used herein, "epitope" refers to an antigenic determinant of a polypeptide; an epitope could comprise 3
amino acids in a spatial conformation which is unique to the epitope. Generally an epitope consists of at least 5 such
amino acids, and more usually, consists of at least 8-10 such amino acids. Methods of determining the spatial confor- 
mation of amino acids are known in the art, and include, for example, x-ray crystallography and 2-dimensional nuclear 
magnetic resonance. 

[0076] A polypeptide is "immunologically reactive" with an antibody when it binds to an antibody due to antibody 
recognition of a specific epitope contained within the polypeptide. Immunological reactivity may be determined by 
antibody binding, more particularly by the kinetics of antibody binding, and/or by competition in binding using as competitor(s)
a known polypeptide(s) containing an epitope against which the antibody is directed. The techniques for
determining whether a polypeptide is immunologically reactive with an antibody are known in the art. 
[0077] As used herein, the term "immunogenic polypeptide" is a polypeptide that elicits a cellular and/or humoral 
response, whether alone or linked to a carrier in the presence or absence of an adjuvant. 

[0078] The term "polypeptide" refers to a polymer of amino acids and does not refer to a specific length of the product; 
thus, peptides, oligopeptides, and proteins are included within the definition of polypeptide. This term also does not 
refer to or exclude post-expression modifications of the polypeptide, for example, glycosylations, acetylations, phosphorylations
and the like. Included within the definition are, for example, polypeptides containing one or more analogs
of an amino acid (including, for example, unnatural amino acids, etc.), polypeptides with substituted linkages, as well 
as other modifications known in the art, both naturally occurring and non-naturally occurring. 

[0079] "Transformation", as used herein, refers to the insertion of an exogenous polynucleotide into a host cell, 
irrespective of the method used for the insertion, for example, direct uptake, transduction, f-mating or electroporation. 
The exogenous polynucleotide may be maintained as a non-integrated vector, for example, a plasmid, or alternatively, 
may be integrated into the host genome. 

[0080] "Treatment" as used herein refers to prophylaxis and/or therapy. 

[0081] An "individual", as used herein, refers to vertebrates, particularly members of the mammalian species, and
includes but is not limited to domestic animals, sports animals, and primates, including humans.
[0082] As used herein, the "sense strand" of a nucleic acid contains the sequence that has sequence homology to 
that of mRNA. The "anti-sense strand" contains a sequence which is complementary to that of the "sense strand". 
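
The sense/anti-sense relationship defined above can be illustrated with a short sketch that derives the anti-sense strand as the reverse complement of a sense sequence. The function names antisense_dna and is_antisense_of are hypothetical, and the sketch assumes unmodified nucleotides written 5' to 3' in the letters A, C, G, T and U only.

```python
# Illustrative sketch only; function names are hypothetical.
# Complement table: U (RNA) is treated like T, and the output is written as DNA.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_dna(sense):
    """Return the DNA anti-sense strand, written 5'->3', for a sense sequence."""
    seq = sense.upper()
    unknown = set(seq) - set("ACGTU")
    if unknown:
        raise ValueError("unsupported nucleotide symbols: %s" % sorted(unknown))
    # Complement each base, then reverse so the result also reads 5'->3'.
    return seq.translate(_COMPLEMENT)[::-1]


def is_antisense_of(candidate, sense):
    """True if candidate (DNA or RNA) is fully complementary to sense."""
    return candidate.upper().replace("U", "T") == antisense_dna(sense)
```

For instance, the sense RNA sequence AUGGCC gives the anti-sense DNA sequence GGCCAT. The check in is_antisense_of requires complementarity over the full length; partial or mismatched hybrids, which hybridization under reduced stringency may tolerate, are outside this sketch.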
[0083] As used herein, a "positive stranded genome" of a virus is one in which the genome, whether RNA or DNA, 
is single-stranded and which encodes a viral polypeptide(s). Examples of positive stranded RNA viruses include Togaviridae,
Coronaviridae, Retroviridae, Picornaviridae, and Caliciviridae. Included also, are the Flaviviridae, which were
formerly classified as Togaviridae. See Fields & Knipe (1986).

[0084] As used herein, "antibody-containing body component" refers to a component of an individual's body which 
is a source of the antibodies of interest. Antibody containing body components are known in the art, and include but 
are not limited to, for example, plasma, serum, spinal fluid, lymph fluid, the external sections of the respiratory, intestinal,
and genitourinary tracts, tears, saliva, milk, white blood cells, and myelomas. 

[0085] As used herein, "purified HCV" refers to a preparation of HCV which has been isolated from the cellular 
constituents with which the virus is normally associated, and from other types of viruses which may be present in the 
infected tissue. The techniques for isolating viruses are known to those of skill in the art, and include, for example,
centrifugation and affinity chromatography; a method of preparing purified HCV is discussed infra.

[0086] The term "HCV particles" as used herein includes entire virions as well as particles which are intermediates in
virion formation. HCV particles generally have one or more HCV proteins associated with the HCV nucleic acid. 
[0087] As used herein, the term "probe" refers to a polynucleotide which forms a hybrid structure with a sequence 
in a target region, due to complementarity of at least one sequence in the probe with a sequence in the target region.
The probe, however, does not contain a sequence complementary to sequence(s) used to prime the polymerase chain
reaction. 

[0088] As used herein, the term "target region" refers to a region of the nucleic acid which is to be amplified and/or 
detected. 

[0089] As used herein, the term "viral RNA", which includes HCV RNA, refers to RNA from the viral genome, fragments
thereof, transcripts thereof, and mutant sequences derived therefrom.

[0090] As used herein, a "biological sample" refers to a sample of tissue or fluid isolated from an individual, including
but not limited to, for example, plasma, serum, spinal fluid, lymph fluid, the external sections of the skin, respiratory,
intestinal, and genitourinary tracts, tears, saliva, milk, blood cells, tumors, organs, and also samples of in vitro cell
culture constituents (including but not limited to conditioned medium resulting from the growth of cells in cell culture
medium, putatively virally infected cells, recombinant cells, and cell components).

II. Description of the Invention 

[0091] The practice of the present invention will employ, unless otherwise indicated, conventional techniques of
molecular biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques
are explained fully in the literature. See e.g., Maniatis, Fritsch & Sambrook, MOLECULAR CLONING: A LABORATORY
MANUAL (1982); DNA CLONING, VOLUMES I AND II (D.N. Glover ed. 1985); OLIGONUCLEOTIDE SYNTHESIS
(M.J. Gait ed. 1984); NUCLEIC ACID HYBRIDIZATION (B.D. Hames & S.J. Higgins eds. 1984); TRANSCRIPTION
AND TRANSLATION (B.D. Hames & S.J. Higgins eds. 1984); ANIMAL CELL CULTURE (R.I. Freshney ed. 1986);
IMMOBILIZED CELLS AND ENZYMES (IRL Press, 1986); B. Perbal, A PRACTICAL GUIDE TO MOLECULAR CLONING
(1984); the series, METHODS IN ENZYMOLOGY (Academic Press, Inc.); GENE TRANSFER VECTORS FOR
MAMMALIAN CELLS (J.H. Miller and M.P. Calos eds. 1987, Cold Spring Harbor Laboratory); Methods in Enzymology
Vol. 154 and Vol. 155 (Wu and Grossman, and Wu, eds., respectively); Mayer and Walker, eds. (1987), IMMUNOCHEMICAL
METHODS IN CELL AND MOLECULAR BIOLOGY (Academic Press, London); Scopes (1987), PROTEIN
PURIFICATION: PRINCIPLES AND PRACTICE, Second Edition (Springer-Verlag, N.Y.); and HANDBOOK OF EXPERIMENTAL
IMMUNOLOGY, VOLUMES I-IV (D.M. Weir and C.C. Blackwell eds. 1986). All patents, patent applications,
and publications mentioned herein, both supra and infra, are hereby incorporated herein by reference.
[0092] The useful materials and processes of the present invention are made possible by the provision of a family
of nucleotide sequences isolated from cDNA libraries which contain HCV cDNA sequences. These cDNA libraries were
derived from nucleic acid sequences present in the plasma of an HCV-infected chimpanzee. The construction of one
of these libraries, the "c" library (ATCC No. 40394), was reported in EPO Pub. No. 318,216. Several of the clones
containing HCV cDNA reported herein were obtained from the "c" library. Although other clones reported herein were
obtained from other HCV cDNA libraries, the presence of clones containing the sequences in the "c" library was confirmed.
As discussed in EPO Pub. No. 318,216, the family of HCV cDNA sequences isolated from the "c" library are
not of human or chimpanzee origin, and show no significant homology to sequences contained within the HBV genome.
[0093] The availability of the HCV cDNAs described herein permits the construction of polynucleotide probes which 
are reagents useful for detecting viral polynucleotides in biological samples, including donated blood. For example, 
from the sequences it is possible to synthesize DNA oligomers of about 8-10 nucleotides, or larger, which are useful
as hybridization probes to detect the presence of HCV RNA in, for example, donated blood, sera of subjects suspected
of harboring the virus, or cell culture systems in which the virus is replicating. In addition, the cDNA sequences also 
allow the design and production of HCV specific polypeptides which are useful as diagnostic reagents for the presence 
of antibodies raised during HCV infection. Antibodies to purified polypeptides derived from the cDNAs may also be
used to detect viral antigens in biological samples, including, for example, donated blood samples, sera from patients
with NANBH, and in tissue culture systems being used for HCV replication. Moreover, the immunogenic polypeptides 
disclosed herein, which are encoded in portions of the ORF of HCV cDNA shown in Fig. 17, are also useful for HCV 
screening, diagnosis, and treatment, and for raising antibodies which are also useful for these purposes. 
[0094] In addition, the novel cDNA sequences described herein enable further characterization of the HCV genome.
Polynucleotide probes and primers derived from these sequences may be used to amplify sequences present in cDNA
libraries, and/or to screen cDNA libraries for additional overlapping cDNA sequences, which, in turn, may be used to 
obtain more overlapping sequences. As indicated infra, and in EPO Pub. No. 318,216, the genome of HCV appears 
to be RNA comprised primarily of a large open reading frame (ORF) which encodes a large polyprotein. 
[0095] The HCV cDNA sequences provided herein, the polypeptides derived from these sequences, and the immunogenic
polypeptides described herein, as well as antibodies directed against these polypeptides are also useful in
the isolation and identification of the blood-borne NANBV (BB-NANBV) agent(s). For example, antibodies directed
against HCV epitopes contained in polypeptides derived from the cDNAs may be used in processes based upon affinity
chromatography to isolate the virus. Alternatively, the antibodies may be used to identify viral particles isolated by other
techniques. The viral antigens and the genomic material within the isolated viral particles may then be further characterized.

[0096] In addition to the above, the information provided infra allows the identification of additional HCV strains or 
isolates. The isolation and characterization of the additional HCV strains or isolates may be accomplished by isolating 
the nucleic acids from body components which contain viral particles and/or viral RNA, creating cDNA libraries using 
polynucleotide probes based on the HCV cDNA probes described infra., screening the libraries for clones containing 
HCV cDNA sequences described infra., and comparing the HCV cDNAs from the new isolates with the cDNAs described
infra. The polypeptides encoded therein, or in the viral genome, may be monitored for immunological cross-reactivity 
utilizing the polypeptides and antibodies described supra. Strains or isolates which fit within the parameters of HCV, 
as described in the Definitions section, supra., are readily identifiable. Other methods for identifying HCV strains will 
be obvious to those of skill in the art, based upon the information provided herein. 


Isolation of the HCV cDNA Sequences 

[0097] The novel HCV cDNA sequences described infra extend the sequence of the cDNA to the HCV genome
reported in EPO Pub. No. 318,216. The sequences which are present in clones b114a, 18g, ag30a, CA205a, CA290a,
CA216a, pi14a, CA167b, CA156e, CA84a, and CA59a lie upstream of the reported sequence, and when compiled,
yield nucleotides nos. -319 to 1348 of the composite HCV cDNA sequence. (The negative number on a nucleotide
indicates its distance upstream of the nucleotide which starts the putative initiator MET codon.) The sequences which
are present in clones b5a and 16jh lie downstream of the reported sequence, and yield nucleotides nos. 8659 to 8866
of the composite sequence. The composite HCV cDNA sequence, which includes the sequences in the aforementioned
clones, is shown in Fig. 17.

[0098] The novel HCV cDNAs described herein were isolated from a number of HCV cDNA libraries, including the
"c" library present in lambda gt11 (ATCC No. 40394). The HCV cDNA libraries were constructed using pooled serum
from a chimpanzee with chronic HCV infection and containing a high titer of the virus, i.e., at least 10^6 chimp infectious
doses/ml (CID/ml). The pooled serum was used to isolate viral particles; nucleic acids isolated from these particles
were used as the template in the construction of cDNA libraries to the viral genome. The procedures for isolation of
putative HCV particles and for constructing the "c" HCV cDNA library are described in EPO Pub. No. 318,216. Other
methods for constructing HCV cDNA libraries are known in the art, and some of these methods are described infra.,
in the Examples. Isolation of the sequences was by screening the libraries using synthetic polynucleotide probes, the
sequences of which were derived from the 5'-region and the 3'-region of the known HCV cDNA sequence. The description
of the method to retrieve the cDNA sequences is mostly of historical interest. The resultant sequences (and
their complements) are provided herein, and the sequences, or any portion thereof, could be prepared using synthetic
methods, or by a combination of synthetic methods with retrieval of partial sequences using methods similar to those
described herein.

Preparation of Viral Polypeptides and Fragments

[0099] The availability of HCV cDNA sequences, or nucleotide sequences derived therefrom (including segments 
and modifications of the sequence), permits the construction of expression vectors encoding antigenically active regions
of the polypeptide encoded in either strand. These antigenically active regions may be derived from coat or
envelope antigens or from core antigens, or from antigens which are non-structural including, for example, polynucleotide
binding proteins, polynucleotide polymerase(s), and other viral proteins required for the replication and/or assembly
of the virus particle. Fragments encoding the desired polypeptides are derived from the cDNA clones using
conventional restriction digestion or by synthetic methods, and are ligated into vectors which may, for example, contain
portions of fusion sequences such as beta-galactosidase or superoxide dismutase (SOD), preferably SOD. Methods 
and vectors which are useful for the production of polypeptides which contain fusion sequences of SOD are described 
in European Patent Office Publication number 0196056, published October 1, 1986. Vectors for the expression of 
fusion polypeptides of SOD and HCV polypeptides encoded in a number of HCV clones are described infra., in the 
Examples. Any desired portion of the HCV cDNA containing an open reading frame, in either sense strand, can be 
obtained as a recombinant polypeptide, such as a mature or fusion protein; alternatively, a polypeptide encoded in the 
cDNA can be provided by chemical synthesis. 

[0100] The DNA encoding the desired polypeptide, whether in fused or mature form, and whether or not containing 
a signal sequence to permit secretion, may be ligated into expression vectors suitable for any convenient host. Both 
eukaryotic and prokaryotic host systems are presently used in forming recombinant polypeptides, and a summary of 
some of the more common control systems and host cell lines is given infra. The polypeptide is then isolated from 
lysed cells or from the culture medium and purified to the extent needed for its intended use. Purification may be by 
techniques known in the art, for example, differential extraction, salt fractionation, chromatography on ion exchange 
resins, affinity chromatography, centrifugation, and the like. See, for example, Methods in Enzymology for a variety of 
methods for purifying proteins. Such polypeptides can be used as diagnostics, or those which give rise to neutralizing 
antibodies may be formulated into vaccines. Antibodies raised against these polypeptides can also be used as diag- 
nostics, or for passive immunotherapy. In addition, as discussed infra., antibodies to these polypeptides are useful for 
isolating and identifying HCV particles. 

Preparation of Antigenic Polypeptides and Conjugation with Carrier 

[0101] An antigenic region of a polypeptide is generally relatively small, typically 8 to 10 amino acids or less in length.
Fragments of as few as 5 amino acids may characterize an antigenic region. These segments may correspond to 
regions of HCV antigen. Accordingly, using the cDNAs of HCV as a basis, DNAs encoding short segments of HCV 
polypeptides can be expressed recombinantly either as fusion proteins, or as isolated polypeptides. In addition, short 
amino acid sequences can be conveniently obtained by chemical synthesis. In instances wherein the synthesized 
polypeptide is correctly configured so as to provide the correct epitope, but is too small to be immunogenic, the polypep- 
tide may be linked to a suitable carrier. 

[0102] A number of techniques for obtaining such linkage are known in the art, including the formation of disulfide
linkages using N-succinimidyl-3-(2-pyridylthio)propionate (SPDP) and succinimidyl 4-(N-maleimidomethyl)cyclohexane-1-carboxylate
(SMCC) obtained from Pierce Company, Rockford, Illinois. (If the peptide lacks a sulfhydryl group,
this can be provided by addition of a cysteine residue.) These reagents create a disulfide linkage between themselves
and peptide cysteine residues on one protein and an amide linkage through the epsilon-amino on a lysine, or other
free amino group in the other. A variety of such disulfide/amide-forming agents are known. See, for example, Immun.
Rev. (1982) 62:185. Other bifunctional coupling agents form a thioether rather than a disulfide linkage. Many of these
thio-ether-forming agents are commercially available and include reactive esters of 6-maleimidocaproic acid, 2-bromoacetic
acid, 2-iodoacetic acid, 4-(N-maleimido-methyl)cyclohexane-1-carboxylic acid, and the like. The carboxyl
groups can be activated by combining them with succinimide or 1-hydroxyl-2-nitro-4-sulfonic acid, sodium salt. Additional
methods of coupling antigens employ the rotavirus/"binding peptide" system described in EPO Pub. No. 259,149,
the disclosure of which is incorporated herein by reference. The foregoing list is not meant to be exhaustive, and
modifications of the named compounds can clearly be used.

[0103] Any carrier may be used which does not itself induce the production of antibodies harmful to the host. Suitable 
carriers are typically large, slowly metabolized macromolecules such as proteins; polysaccharides, such as latex func- 
tionalized sepharose, agarose, cellulose, cellulose beads and the like; polymeric amino acids, such as polyglutamic 
acid, polylysine, and the like; amino acid copolymers; and inactive virus particles. Especially useful protein substrates 
are serum albumins, keyhole limpet hemocyanin, immunoglobulin molecules, thyroglobulin, ovalbumin, tetanus toxoid, 
and other proteins well known to those skilled in the art. 

[0104] In addition to full-length viral proteins, polypeptides comprising truncated HCV amino acid sequences encod- 
ing at least one viral epitope are useful immunological reagents. For example, polypeptides comprising such truncated 
sequences can be used as reagents in an immunoassay. These polypeptides also are candidate subunit antigens in
compositions for antiserum production or vaccines. While these truncated sequences can be produced by various 
known treatments of native viral protein, it is generally preferred to make synthetic or recombinant polypeptides comprising
an HCV sequence. Polypeptides comprising these truncated HCV sequences can be made up entirely of HCV
sequences (one or more epitopes, either contiguous or noncontiguous), or HCV sequences and heterologous sequences
in a fusion protein. Useful heterologous sequences include sequences that provide for secretion from a recombinant
host, enhance the immunological reactivity of the HCV epitope(s), or facilitate the coupling of the polypeptide to an
immunoassay support or a vaccine carrier. See, e.g., EPO Pub. No. 116,201; U.S. Pat. No. 4,722,840; EPO Pub. No.
259,149; U.S. Pat. No. 4,629,783, the disclosures of which are incorporated herein by reference. 
[0105] The size of polypeptides comprising the truncated HCV sequences can vary widely, the minimum size being 
a sequence of sufficient size to provide an HCV epitope, while the maximum size is not critical. For convenience, the 
maximum size usually is not substantially greater than that required to provide the desired HCV epitopes and function(s)
of the heterologous sequence, if any. Typically, the truncated HCV amino acid sequence will range from about 5 to
about 100 amino acids in length. More typically, however, the HCV sequence will be a maximum of about 50 amino 
acids in length, preferably a maximum of about 30 amino acids. It is usually desirable to select HCV sequences of at 
least about 10, 12 or 15 amino acids, up to a maximum of about 20 or 25 amino acids. 

[0106] Truncated HCV amino acid sequences comprising epitopes can be identified in a number of ways. For ex- 
ample, the entire viral protein sequence can be screened by preparing a series of short peptides that together span 
the entire protein sequence. An example of antigenic screening of the regions of the HCV polyprotein is shown infra. 
In addition, by starting with, for example, 100mer polypeptides, it would be routine to test each polypeptide for the 
presence of epitope(s) showing a desired reactivity, and then testing progressively smaller and overlapping fragments 
from an identified 100mer to map the epitope of interest. Screening such peptides in an immunoassay is within the 
skill of the art. It is also known to carry out a computer analysis of a protein sequence to identify potential epitopes, 
and then prepare oligopeptides comprising the identified regions for screening. Such a computer analysis of the HCV 
amino acid sequence is shown in Fig. 20, where the hydrophilic/hydrophobic character is displayed above the antigen 
index. The amino acids are numbered from the starting MET (position 1) as shown in Fig. 17. It is appreciated by those
of skill in the art that such computer analysis of antigenicity does not always identify an epitope that actually exists, 
and can also incorrectly identify a region of the protein as containing an epitope. 
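
As a non-limiting illustration of the screening approach just described, the following sketch divides a protein sequence into overlapping peptides that together span it, labelled with 1-based amino acid numbers, and computes a simple per-peptide hydropathy average. The function names, the default length of 100 and step of 50, and the use of the published Kyte-Doolittle scale are assumptions made for this sketch; the scale is a stand-in and is not the computer analysis used to prepare Fig. 20.

```python
# Illustrative sketch only; names, defaults and the choice of scale are assumptions.
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def overlapping_peptides(protein, length=100, step=50):
    """Return (first_aa, last_aa, peptide) tuples that span the whole protein.

    Positions are 1-based and inclusive, matching the AAn-AAm style of
    numbering. A final peptide is added if needed so the C-terminus is covered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    n = len(protein)
    if n == 0:
        return []
    if n <= length:
        return [(1, n, protein)]
    starts = list(range(0, n - length + 1, step))
    if starts[-1] != n - length:
        starts.append(n - length)  # make sure the last residues are included
    return [(s + 1, s + length, protein[s:s + length]) for s in starts]


def mean_hydropathy(peptide):
    """Average Kyte-Doolittle value; raises KeyError on non-standard residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide.upper()) / len(peptide)
```

A reactive peptide found in a first pass can be passed back to overlapping_peptides with a smaller length and step to narrow down the epitope. As noted above for computer analyses generally, a hydropathy average is only a rough guide and does not replace testing the peptides in an immunoassay.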

[0107] Examples of HCV amino acid sequences that may be useful, which are expressed from expression vectors 
comprised of clones 5-1-1, 81, CA74a, 35f, 279a, C36, C33b, CA290a, C8f, C12f, 14c, 15e, C25c, C33c, C33f, 33g,
C39c, C40b, CA167b are described infra. Other examples of HCV amino acid sequences that may be useful as de- 
scribed herein are set forth below. It is to be understood that these peptides do not necessarily precisely map one 
epitope, and may also contain HCV sequence that is not immunogenic. These non-immunogenic portions of the se- 
quence can be defined as described above using conventional techniques and deleted from the described sequences. 
Further, additional truncated HCV amino acid sequences that comprise an epitope or are immunogenic can be identified 
as described above. The following sequences are given by amino acid number (i.e., "AAn") where n is the amino acid 
number as shown in Fig. 17:

AA1-AA25; AA1-AA50; AA1-AA84; AA9-AA177; AA1-AA10; AA5-AA20; AA20-AA25; AA35-AA45; AA50-AA100 
AA40-AA90; AA45-AA65; AA65-AA75; AA80-90; AA99-AA120; AA95-AA110; AA105-AA120; AA100-AA150 

AA155-AA170; AA190-AA210; AA200-AA250; AA220-AA240; AA245-AA265; AA250-AA300 


AA150-AA200 
AA290-AA330 
AA400-AA450 
AA460-AA470 
AA575-AA605 
AA700-AA750 
AA825-AA850 
AA970-AA990 
AA1040-AA1055 
AA1192-AA1457 
AA1260-AA1280 
AA1345-AA1365 
AA1460-AA1475 
AA1550-AA1600 
AA161 0-AA1645 
AA1720-AA1745 
AA1900-AA1950 
AA1950-AA1985 
AA2045-AA2070 
AA2200-AA2325 
AA2300-AA2350 
AA2345-AA2415 
AA241 5-AA2450 


AA290-305; AA300-AA350; AA310-AA330; AA350-AA400; 


AA405-AA415 
AA475-AA495 
AA585-AA600 
AA700-AA725 


AA415-AA425; AA425-AA435; AA437-AA582 

AA500-AA550; AA511-AA690; AA515-AA550 

AA600-AA650; AA600-AA625; AA635-AA665 

AA700-AA750; AA725-AA775; AA770-AA790 


AA380-AA395; AA405-AA495 

AA450-AA500; AA440-AA460 

AA550-AA600; AA550-AA625 

AA650-AA700; AA645-AA680 

AA750-AA800; AA800-AA815 


AA850-AA875; AA800-AA850; AA920-AA990; AA850-AA900; AA920-AA945; AA940-AA965 


AA950-AA1000; AA1000-AA1060; AA1000-AA1025; 


AA1075-AA1175 
AA1195-AA1250 
AA1266-AA1428 
AA1350-AA1400 
AA1475-AA1515 
AA1545-AA1560 
AA1650-AA1690 
AA1745-AA1770 
AA1900-AA1920 
AA1980-AA2000 
AA2054-AA2223 
AA2250-AA2330 
AA2290-AA2310 
AA2345-AA2375 
AA2445-AA2500 


AA1050-AA1200 

AA1200-AA1225 

AA1300-AA1350 

AA1365-AA1380 

AA1475-AA1500 

AA1569-AA1931 

AA1685-AA1770 

AA1750-AA1800 

AA1916-AA2021 

AA2000-AA2050 

AA2070-AA2100 

AA2255-AA2270 

AA2310-AA2330 

AA2370-AA2410 

AA2445-AA2475 


AA1070-AA1100 
AA1225-AA1250 
AA1290-AA1310 
AA1380-AA1405 
AA1500-AA1550 
AA1570-AA1590 
AA1689-AA1805 
AA1775-AA1810 
AA1920-AA1940 
AA2005-AA2025 
AA2100-AA2150 
AA2265-AA2280 
AA2330-AA2350 
AA2371-AA2502 
AA2470-AA2490 


AA1000-AA1050; 
AA1100-AA1130 
AA1250-AA1300 
AA1310-AA1340 
AA1400-AA1450 
AA1500-AA1515 
AA1595-AA1610 
AA1690-AA1720 
AA1795-AA1850 
AA1949-AA2124 
AA2020-AA2045 
AA2150-AA2200 
AA2280-AA2290 
AA2350-AA2400 
AA2400-AA2450 
AA2500-AA2550; 


AA1025-AA1040 
AA1140-AA1165 
AA1260-AA1310 
AA1345-AA1405 
AA1450-AA1500 
AA1515-AA1550 
AA1590-AA1650 
AA1694-AA1735 
AA1850-AA1900 
AA1950-AA2000 
AA2045-AA2100 
AA2200-AA2250 
AA2287-AA2385 
AA2348-AA2464 
AA2400-AA2425 
AA2505-AA2540 
AA2535-AA2560; AA2550-AA2600; AA2560-AA2580; AA2600-AA2650; AA2605-AA2620; AA2620-AA2650; 
AA2640-AA2660; AA2650-AA2700; AA2655-AA2670; AA2670-AA2700; AA2700-AA2750; AA2740-AA2760; 
AA2750-AA2800; AA2755-AA2780; 

AA2780-AA2830; AA2785-AA2810; AA2796-AA2886; AA2810-AA2825; AA2800-AA2850; AA2850-AA2900; 
AA2850-AA2865; AA2885-AA2905; AA2900-AA2950; AA2910-AA2930; AA2925-AA2950; AA2945-end(C terminal). 
The above HCV amino acid sequences can be prepared as discrete peptides or incorporated into a larger polypeptide, 
and may find use as described herein. Additional polypeptides comprising truncated HCV sequences are described in 
the examples. 
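
As an illustration of the "AAn" numbering used in the list above, the following minimal Python sketch parses a range designation and slices the corresponding peptide out of a polyprotein sequence. The function names and the demonstration sequence are hypothetical and are not part of the disclosure; the sequence is an arbitrary placeholder, not the sequence of Fig. 17.

```python
import re

# Matches designations such as "AA1-AA25"; the second "AA" is optional
# because some entries are written in the short form "AA80-90".
_RANGE = re.compile(r"^AA(\d+)-(?:AA)?(\d+)$")


def parse_range(spec):
    """Return (start, end) as 1-based, inclusive residue numbers."""
    match = _RANGE.match(spec.strip())
    if match is None:
        raise ValueError(f"unrecognized range: {spec!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start < 1 or end < start:
        raise ValueError(f"invalid range: {spec!r}")
    return start, end


def extract_peptide(polyprotein, spec):
    """Slice a peptide out of a polyprotein; residue 1 is the starting Met."""
    start, end = parse_range(spec)
    if end > len(polyprotein):
        raise ValueError(f"{spec} extends past residue {len(polyprotein)}")
    return polyprotein[start - 1:end]


# Arbitrary placeholder sequence (assumption), used only to show the indexing.
demo = "MACDEFGHIKLMNPQRSTVWY" * 3

print(extract_peptide(demo, "AA1-AA10"))
print(extract_peptide(demo, "AA5-AA20"))
```

Because the numbering is 1-based and inclusive, a designation AAm-AAn always yields n - m + 1 residues.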

[0108] The observed relationship of the putative polyproteins of HCV and the Flaviviruses allows some prediction of 
the putative domains of the HCV "non-structural" (NS) proteins. The locations of the individual NS proteins in the 
putative Flavivirus precursor polyprotein are fairly well-known. Moreover, these also coincide with observed gross 
fluctuations in the hydrophobicity profile of the polyprotein. It is established that NS5 of Flaviviruses encodes the virion 
polymerase, and that NS1 corresponds with a complement fixation antigen which has been shown to be an effective 
vaccine in animals. Recently, it has been shown that a flaviviral protease function resides in NS3. Due to the observed 
similarities between HCV and the Flaviviruses, described infra, deductions concerning the approximate locations of the 
corresponding protein domains and functions in the HCV polyprotein are possible. The expression of polypeptides 
containing these domains in a variety of recombinant host cells, including, for example, bacteria, yeast, insect, and 
vertebrate cells, should give rise to important immunological reagents which can be used for diagnosis, detection, and 
vaccines. 

[0109] Although the non-structural protein regions of the putative polyproteins of the HCV isolate described herein 
and of Flaviviruses appear to have some similarity, there is less similarity between the putative structural regions which 
are towards the N-terminus. In this region, there is a greater divergence in sequence, and in addition, the hydrophobic 
profile of the two regions shows less similarity. This "divergence" begins in the N-terminal region of the putative NS1 
domain in HCV, and extends to the presumed N-terminus. Nevertheless, it may still be possible to predict the approx- 
imate locations of the putative nucleocapsid (N-terminal basic domain) and E (generally hydrophobic) domains within 
the HCV polyprotein. In the Examples the predictions are based on the changes observed in the hydrophobic profile 
of the HCV polyprotein, and on a knowledge of the location and character of the flaviviral proteins. From these predic- 
tions it may be possible to identify approximate regions of the HCV polyprotein that could correspond with useful 
immunological reagents. For example, the E and NS1 proteins of Flaviviruses are known to have efficacy as protective 
vaccines. These regions, as well as some which are shown to be antigenic in the HCV isolate described herein, for 
example those within putative NS3, C, and NS5, etc., should also provide diagnostic reagents. Moreover, the location 
and expression of viral-encoded enzymes may also allow the evaluation of anti-viral enzyme inhibitors, for example, 
inhibitors which prevent enzyme activity by virtue of an interaction with the enzyme itself, or substances which may 
prevent expression of the enzyme (for example, anti-sense RNA, or other drugs which interfere with expression). 
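
The kind of hydrophobic profile referred to above can be sketched as a sliding-window average over a per-residue hydropathy scale. The text does not state which scale or window was used for the HCV polyprotein, so the Kyte-Doolittle scale and the 9-residue window below are assumptions chosen for illustration, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of each window of `window` residues.

    Element i of the result covers residues i+1 .. i+window (1-based).
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    total = sum(scores[:window])
    profile = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile


# Toy sequence (assumption): a basic stretch followed by a hydrophobic one.
toy = "KKKKKRRRRR" + "IIIIILLLLL"
profile = hydropathy_profile(toy, window=5)
print(min(profile), max(profile))
```

In such a profile, a basic N-terminal domain appears as a run of strongly negative values and a generally hydrophobic domain as a run of positive values, which is the sort of gross fluctuation used to place approximate domain boundaries.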

Preparation of Hybrid Particle Immunogens Containing HCV Epitopes 

[0110] The immunogenicity of the epitopes of HCV may also be enhanced by preparing them in mammalian or yeast 
systems fused with or assembled with particle-forming proteins such as, for example, that associated with hepatitis B 
surface antigen. Constructs wherein the NANBV epitope is linked directly to the particle-forming protein coding se- 
quences produce hybrids which are immunogenic with respect to the HCV epitope. In addition, all of the vectors pre- 
pared include epitopes specific to HBV, having various degrees of immunogenicity, such as, for example, the pre-S 
peptide. Thus, particles constructed from particle-forming protein which include HCV sequences are immunogenic with 
respect to HCV and HBV. 

[0111] Hepatitis surface antigen (HBSAg) has been shown to be formed and assembled into particles in S. cerevisiae 
(Valenzuela et al. (1982)), as well as in, for example, mammalian cells (Valenzuela, P., et al. (1984)). The formation of 
such particles has been shown to enhance the immunogenicity of the monomer subunit. The constructs may also 
include the immunodominant epitope of HBSAg, comprising the 55 amino acids of the presurface (pre-S) region 
(Neurath et al. (1984)). Constructs of the pre-S-HBSAg particle expressible in yeast are disclosed in EPO 174,444, 
published March 19, 1986; hybrids including heterologous viral sequences for yeast expression are disclosed in EPO 
175,261, published March 26, 1986. These constructs may also be expressed in mammalian cells such as Chinese 
hamster ovary (CHO) cells using an SV40-dihydrofolate reductase vector (Michelle et al. (1984)). 
[0112] In addition, portions of the particle-forming protein coding sequence may be replaced with codons encoding 
an HCV epitope. In this replacement, regions which are not required to mediate the aggregation of the units to form 
immunogenic particles in yeast or mammals can be deleted, thus eliminating additional HBV antigenic sites from 
competition with the HCV epitope. 

Preparation of Vaccines 

[0113] Vaccines may be prepared from one or more immunogenic polypeptides derived from HCV cDNA, including 
the cDNA sequences described in the Examples. The observed homology between HCV and Flaviviruses provides 
information concerning the polypeptides which may be most effective as vaccines, as well as the regions of the genome 
in which they are encoded. The general structure of the Flavivirus genome is discussed in Rice et al. (1986). The 
flavivirus genomic RNA is believed to be the only virus-specific mRNA species, and it is translated into the three viral 
structural proteins, i.e., C, M, and E, as well as two large nonstructural proteins, NS4 and NS5, and a complex set of 
smaller nonstructural proteins. It is known that major neutralizing epitopes for Flaviviruses reside in the E (envelope) 
protein (Roehrig (1986)). Thus, vaccines may be comprised of recombinant polypeptides containing epitopes of HCV 
E. These polypeptides may be expressed in bacteria, yeast, or mammalian cells, or alternatively may be isolated from 
viral preparations. It is also anticipated that the other structural proteins may contain epitopes which give rise to 
protective anti-HCV antibodies. Thus, polypeptides containing the epitopes of E, C, and M may also be used, whether 
singly or in combination, in HCV vaccines. 

[0114] In addition to the above, it has been shown that immunization with NS1 (nonstructural protein 1) results in 
protection against yellow fever (Schlesinger et al. (1986)). This is true even though the immunization does not give rise 
to neutralizing antibodies. Thus, particularly since this protein appears to be highly conserved among Flaviviruses, it 
is likely that HCV NS1 will also be protective against HCV infection. Moreover, it also shows that nonstructural proteins 
may provide protection against viral pathogenicity, even if they do not cause the production of neutralizing antibodies. 

[0115] The information provided in the Examples concerning the immunogenicity of the polypeptides expressed from 
cloned HCV cDNAs which span the various regions of the HCV ORF also allows predictions concerning their use in 
vaccines. 

[0116] In view of the above, multivalent vaccines against HCV may be comprised of one or more epitopes from one 
or more structural proteins, and/or one or more epitopes from one or more nonstructural proteins. These vaccines may 
be comprised of, for example, recombinant HCV polypeptides and/or polypeptides isolated from the virions. In 
particular, vaccines are contemplated comprising one or more of the following HCV proteins, or subunit antigens derived 
therefrom: E, NS1, C, NS2, NS3, NS4 and NS5. Particularly preferred are vaccines comprising E and/or NS1, or 
subunits thereof. 

[0117] The preparation of vaccines which contain an immunogenic polypeptide(s) as active ingredients is known to 
one skilled in the art. Typically, such vaccines are prepared as injectables, either as liquid solutions or suspensions; 
solid forms suitable for solution in, or suspension in, liquid prior to injection may also be prepared. The preparation 
may also be emulsified, or the protein encapsulated in liposomes. The active immunogenic ingredients are often mixed 
with excipients which are pharmaceutically acceptable and compatible with the active ingredient. Suitable excipients 
are, for example, water, saline, dextrose, glycerol, ethanol, or the like and combinations thereof. In addition, if desired, 
the vaccine may contain minor amounts of auxiliary substances such as wetting or emulsifying agents, pH buffering 
agents, and/or adjuvants which enhance the effectiveness of the vaccine. Examples of adjuvants which may be effective 
include but are not limited to: aluminum hydroxide, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), 
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP), 
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine 
(CGP 19835A, referred to as MTP-PE), and RIBI, which contains three components extracted from bacteria, 
monophosphoryl lipid A, trehalose dimycolate and cell wall skeleton (MPL+TDM+CWS) in a 2% squalene/Tween 80 
emulsion. The effectiveness of an adjuvant may be determined by measuring the amount of antibodies directed against 
an immunogenic polypeptide containing an HCV antigenic sequence resulting from administration of this polypeptide 
in vaccines which are also comprised of the various adjuvants. 

[0118] The vaccines are conventionally administered parenterally, by injection, for example, either subcutaneously 
or intramuscularly. Additional formulations which are suitable for other modes of administration include suppositories 
and, in some cases, oral formulations. For suppositories, traditional binders and carriers may include, for example, 
polyalkylene glycols or triglycerides; such suppositories may be formed from mixtures containing the active ingredient 
in the range of 0.5% to 10%, preferably 1%-2%. Oral formulations include such normally employed excipients as, for 
example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, 
magnesium carbonate, and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, 
sustained release formulations or powders and contain 10%-95% of active ingredient, preferably 25%-70%. 
[0119] The proteins may be formulated into the vaccine as neutral or salt forms. Pharmaceutically acceptable salts 
include the acid addition salts (formed with free amino groups of the peptide) which are formed with inorganic 
acids such as, for example, hydrochloric or phosphoric acids, or organic acids such as acetic, oxalic, tartaric, 
maleic, and the like. Salts formed with the free carboxyl groups may also be derived from inorganic bases such as, for 
example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, 
trimethylamine, 2-ethylamino ethanol, histidine, procaine, and the like. 

Dosage and Administration of Vaccines 

[0120] The vaccines are administered in a manner compatible with the dosage formulation, and in such amount as 
will be prophylactically and/or therapeutically effective. The quantity to be administered, which is generally in the range 
of 5 micrograms to 250 micrograms of antigen per dose, depends on the subject to be treated, capacity of the subject's 
immune system to synthesize antibodies, and the degree of protection desired. Precise amounts of active ingredient 
required to be administered may depend on the judgment of the practitioner and may be peculiar to each subject. 
[0121] The vaccine may be given in a single dose schedule, or preferably in a multiple dose schedule. A multiple 
dose schedule is one in which a primary course of vaccination may be with 1-10 separate doses, followed by other 
doses given at subsequent time intervals required to maintain and/or reinforce the immune response, for example, at 
1-4 months for a second dose, and if needed, a subsequent dose(s) after several months. The dosage regimen will 
also, at least in part, be determined by the need of the individual and be dependent upon the judgment of the practitioner. 
[0122] In addition, the vaccine containing the immunogenic HCV antigen(s) may be administered in conjunction with 
other immunoregulatory agents, for example, immune globulins. 

Preparation of Antibodies Against HCV Epitopes 

[0123] The immunogenic polypeptides prepared as described above are used to produce antibodies, both polyclonal 
and monoclonal. If polyclonal antibodies are desired, a selected mammal (e.g., mouse, rabbit, goat, horse, etc.) is 
immunized with an immunogenic polypeptide bearing an HCV epitope(s). Serum from the immunized animal is collected 
and treated according to known procedures. If serum containing polyclonal antibodies to an HCV epitope contains 
antibodies to other antigens, the polyclonal antibodies can be purified by immunoaffinity chromatography. Techniques 
for producing and processing polyclonal antisera are known in the art; see, for example, Mayer and Walker (1987). 
[0124] Alternatively, polyclonal antibodies may be isolated from a mammal which has been previously infected with 
HCV. An example of a method for purifying antibodies to HCV epitopes from serum from an infected individual, based 
upon affinity chromatography and utilizing a fusion polypeptide of SOD and a polypeptide encoded within cDNA clone 
5-1-1, is presented in EPO Pub. No. 318,216. 

[0125] Monoclonal antibodies directed against HCV epitopes can also be readily produced by one skilled in the art. 
The general methodology for making monoclonal antibodies by hybridomas is well known. Immortal antibody-producing 
cell lines can be created by cell fusion, and also by other techniques such as direct transformation of B lymphocytes 
with oncogenic DNA, or transfection with Epstein-Barr virus. See, e.g., M. Schreier et al. (1980); Hammerling et al. 
(1981); Kennett et al. (1980); see also, U.S. Patent Nos. 4,341,761; 4,399,121; 4,427,783; 4,444,887; 4,466,917; 
4,472,500; 4,491,632; and 4,493,890. Panels of monoclonal antibodies produced against HCV epitopes can be 
screened for various properties, e.g., for isotype, epitope affinity, etc. 
[0126] Antibodies, both monoclonal and polyclonal, which are directed against HCV epitopes are particularly useful 
in diagnosis, and those which are neutralizing are useful in passive immunotherapy. Monoclonal antibodies, in 
particular, may be used to raise anti-idiotype antibodies. 

[0127] Anti-idiotype antibodies are immunoglobulins which carry an "internal image" of the antigen of the infectious 
agent against which protection is desired. See, for example, Nisonoff, A., et al. (1981) and Dreesman et al. (1985). 

[0128] Techniques for raising anti-idiotype antibodies are known in the art. See, for example, Grzych (1985), 
MacNamara et al. (1984), and Uytdehaag et al. (1985). These anti-idiotype antibodies may also be useful for treatment 
and/or diagnosis of NANBH, as well as for an elucidation of the immunogenic regions of HCV antigens. 
[0129] It would also be recognized by one of ordinary skill in the art that a variety of types of antibodies directed 
against HCV epitopes may be produced. As used herein, the term "antibody" refers to a polypeptide or group of 
polypeptides which are comprised of at least one antibody combining site. An "antibody combining site" or "binding domain" 
is formed from the folding of variable domains of an antibody molecule(s) to form three-dimensional binding spaces 
with an internal surface shape and charge distribution complementary to the features of an epitope of an antigen, which 
allows an immunological reaction with the antigen. An antibody combining site may be formed from a heavy and/or a 
light chain domain (VH and VL, respectively), which form hypervariable loops which contribute to antigen binding. The 
term "antibody" includes, for example, vertebrate antibodies, hybrid antibodies, chimeric antibodies, altered antibodies, 
univalent antibodies, the Fab proteins, and single domain antibodies. 

[0130] A "single domain antibody" (dAb) is an antibody which is comprised of a VH domain, which reacts 
immunologically with a designated antigen. A dAb does not contain a VL domain, but may contain other antigen binding domains 
known to exist in antibodies, for example, the kappa and lambda domains. Methods for preparing dAbs are known in 
the art. See, for example, Ward et al. (1989). 

[0131] Antibodies may also be comprised of VH and VL domains, as well as other known antigen binding domains. 
Examples of these types of antibodies and methods for their preparation are known in the art (see, e.g., U.S. Patent 
No. 4,816,467, which is incorporated herein by reference), and include the following. For example, "vertebrate 
antibodies" refers to antibodies which are tetramers or aggregates thereof, comprising light and heavy chains which are 
usually aggregated in a "Y" configuration and which may or may not have covalent linkages between the chains. In 
vertebrate antibodies, the amino acid sequences of all the chains of a particular antibody are homologous with the 
chains found in one antibody produced by the lymphocyte which produces that antibody in situ, or in vitro (for example, 
in hybridomas). Vertebrate antibodies typically include native antibodies, for example, purified polyclonal antibodies 
and monoclonal antibodies. Examples of the methods for the preparation of these antibodies are described infra. 
[0132] "Hybrid antibodies" are antibodies wherein one pair of heavy and light chains is homologous to those in a first 
antibody, while the other pair of heavy and light chains is homologous to those in a different second antibody. Typically, 
each of these two pairs will bind different epitopes, particularly on different antigens. This results in the property of 
"divalence", i.e., the ability to bind two antigens simultaneously. Such hybrids may also be formed using chimeric 
chains, as set forth below. 

[0133] "Chimeric antibodies" are antibodies in which the heavy and/or light chains are fusion proteins. Typically the 
constant domain of the chains is from one particular species and/or class, and the variable domains are from a different 
species and/or class. Also included is any antibody in which either or both of the heavy or light chains are composed 
of combinations of sequences mimicking the sequences in antibodies of different sources, whether these sources be 
differing classes, or different species of origin, and whether or not the fusion point is at the variable/constant boundary. 
Thus, it is possible to produce antibodies in which neither the constant nor the variable region mimics known antibody 
sequences. It then becomes possible, for example, to construct antibodies whose variable region has a higher specific 
affinity for a particular antigen, or whose constant region can elicit enhanced complement fixation, or to make other 
improvements in properties possessed by a particular constant region. 

[0134] Another example is "altered antibodies", which refers to antibodies in which the naturally occurring amino 
acid sequence in a vertebrate antibody has been varied. Utilizing recombinant DNA techniques, antibodies can be 
redesigned to obtain desired characteristics. The possible variations are many, and range from the changing of one 
or more amino acids to the complete redesign of a region, for example, the constant region. Changes in the constant 
region are made, in general, to attain desired cellular process characteristics, e.g., changes in complement fixation, 
interaction with membranes, and other effector functions. Changes in the variable region may be made to alter antigen 
binding characteristics. The antibody may also be engineered to aid the specific delivery of a molecule or substance to 
a specific cell or tissue site. The desired alterations may be made by known techniques in molecular biology, e.g., 
recombinant techniques, site directed mutagenesis, etc. 

[0135] Yet another example is "univalent antibodies", which are aggregates comprised of a heavy chain/light chain 
dimer bound to the Fc (i.e., constant) region of a second heavy chain. This type of antibody escapes antigenic 
modulation. See, e.g., Glennie et al. (1982). 

[0136] Included also within the definition of antibodies are "Fab" fragments of antibodies. The "Fab" region refers to 
those portions of the heavy and light chains which are roughly equivalent, or analogous, to the sequences which 
comprise the branch portion of the heavy and light chains, and which have been shown to exhibit immunological binding 
to a specified antigen, but which lack the effector Fc portion. "Fab" includes aggregates of one heavy and one light 
chain (commonly known as Fab'), as well as tetramers containing the 2H and 2L chains (referred to as F(ab)2), which 
are capable of selectively reacting with a designated antigen or antigen family. "Fab" antibodies may be divided into 
subsets analogous to those described above, i.e., "vertebrate Fab", "hybrid Fab", "chimeric Fab", and "altered Fab". 
Methods of producing "Fab" fragments of antibodies are known within the art and include, for example, proteolysis, 
and synthesis by recombinant techniques. 

Diagnostic Oligonucleotide Probes and Kits 

[0137] Using the disclosed portions of the isolated HCV cDNAs as a basis, oligomers of approximately 8 nucleotides 
or more can be prepared, either by excision or synthetically, which hybridize with the HCV genome and are useful in 
identification of the viral agent(s), further characterization of the viral genome(s), as well as in detection of the 
virus(es) in diseased individuals. The probes for HCV polynucleotides (natural or derived) are of a length which allows the 
detection of unique viral sequences by hybridization. While 6-8 nucleotides may be a workable length, sequences of 
10-12 nucleotides are preferred, and about 20 nucleotides appears optimal. Preferably, these sequences will derive 
from regions which lack heterogeneity. These probes can be prepared using routine methods, including automated 
oligonucleotide synthetic methods. Among useful probes, for example, are those derived from the newly isolated clones 
disclosed herein, as well as the various oligomers useful in probing cDNA libraries, set forth below. A complement to 
any unique portion of the HCV genome will be satisfactory. For use as probes, complete complementarity is desirable, 
though it may be unnecessary as the length of the fragment is increased. 
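
A probe that is a "complement to" a portion of the genome is the reverse complement of that portion. The minimal Python sketch below enumerates candidate probes of a chosen length over a target region; the function names and the target string are hypothetical, and the target is an arbitrary placeholder rather than any HCV sequence. Screening candidates for uniqueness and for regions lacking heterogeneity is not shown.

```python
# Complement table producing a DNA probe; "U" in an RNA target pairs with "A".
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(sequence):
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def candidate_probes(target, length=20):
    """Yield (offset, probe) for every window of `length` nucleotides.

    `offset` is the 0-based start of the window in the target strand;
    `probe` is written 5'->3' and is complementary to that window.
    """
    if length < 1 or length > len(target):
        raise ValueError("probe length must fit within the target")
    for offset in range(len(target) - length + 1):
        yield offset, reverse_complement(target[offset:offset + length])


def gc_fraction(sequence):
    """Fraction of G and C residues, a rough guide to duplex stability."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# Arbitrary placeholder target (assumption), written as sense-strand RNA.
target = "AUGGCUACCGUUAGCCAUGGAUCCGAUUACGGCAUUAGCC"

for offset, probe in list(candidate_probes(target, length=20))[:3]:
    print(offset, probe, round(gc_fraction(probe), 2))
```

A target of N nucleotides yields N - L + 1 candidate probes of length L, so even a short conserved region offers several choices at the roughly 20-nucleotide length described above.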

[0138] For use of such probes as diagnostics, the biological sample to be analyzed, such as blood or serum, may 
be treated, if desired, to extract the nucleic acids contained therein. The resulting nucleic acid from the sample may 
be subjected to gel electrophoresis or other size separation techniques; alternatively, the nucleic acid sample may be 
dot blotted without size separation. The probes are then labeled. Suitable labels, and methods for labeling probes, are 
known in the art, and include, for example, radioactive labels incorporated by nick translation or kinasing, biotin, 
fluorescent probes, and chemiluminescent probes. The nucleic acids extracted from the sample are then treated with the 
labeled probe under hybridization conditions of suitable stringencies, and polynucleotide duplexes containing the probe 
are detected. 

[0139] The probes can be made completely complementary to the HCV genome. Therefore, usually high stringency 
conditions are desirable in order to prevent false positives. However, conditions of high stringency should only be used 
if the probes are complementary to regions of the viral genome which lack heterogeneity. The stringency of hybridization 
is determined by a number of factors during hybridization and during the washing procedure, including temperature, 
ionic strength, length of time, and concentration of formamide. These factors are outlined in, for example, Maniatis, T. 
(1982). 
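
How ionic strength, formamide and duplex length interact can be illustrated with a commonly quoted empirical approximation for the melting temperature of a DNA duplex. The formula and its coefficients are not taken from this specification; they are an assumption introduced here for illustration, the approximation is generally applied to longer hybrids rather than to short oligomers, and the function name is hypothetical.

```python
import math


def approx_tm_celsius(length_nt, gc_percent, monovalent_molar,
                      formamide_percent=0.0):
    """Estimate duplex melting temperature in degrees Celsius.

    Assumed empirical form (not from the specification):
        Tm = 81.5 + 16.6*log10[M+] + 0.41*(%GC)
             - 0.61*(%formamide) - 500/length
    """
    if length_nt <= 0 or monovalent_molar <= 0:
        raise ValueError("length and salt concentration must be positive")
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_nt)


# Illustrative conditions (assumptions): a 300-nt hybrid with 50% GC.
low_stringency = approx_tm_celsius(300, 50, monovalent_molar=0.9)
with_formamide = approx_tm_celsius(300, 50, monovalent_molar=0.9,
                                   formamide_percent=50)
low_salt = approx_tm_celsius(300, 50, monovalent_molar=0.02)

print(round(low_stringency, 1), round(with_formamide, 1), round(low_salt, 1))
```

Under this approximation, lowering the salt concentration or adding formamide lowers the melting temperature, so a given hybridization or wash temperature becomes more stringent; this is the sense in which the listed factors jointly set the stringency.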

[0140] Generally, it is expected that the HCV genome sequences will be present in serum of infected individuals at 
relatively low levels, i.e., at approximately 10^2-10^3 chimp infectious doses (CID) per ml. This level may require that 
amplification techniques be used in hybridization assays. Such techniques are known in the art. For example, the Enzo 
Biochemical Corporation "Bio-Bridge" system uses terminal deoxynucleotide transferase to add unmodified 
3'-poly-dT-tails to a DNA probe. The poly dT-tailed probe is hybridized to the target nucleotide sequence, and then to a 
biotin-modified poly-A. PCT application 84/03520 and EPA 124221 describe a DNA hybridization assay in which: (1) analyte 
is annealed to a single-stranded DNA probe that is complementary to an enzyme-labeled oligonucleotide; and (2) the 
resulting tailed duplex is hybridized to an enzyme-labeled oligonucleotide. EPA 204510 describes a DNA hybridization 
assay in which analyte DNA is contacted with a probe that has a tail, such as a poly-dT tail, an amplifier strand that 
has a sequence that hybridizes to the tail of the probe, such as a poly-A sequence, and which is capable of binding a 
plurality of labeled strands. A particularly desirable technique may first involve amplification of the target HCV 
sequences in sera approximately 10,000 fold, i.e., to approximately 10^6 sequences/ml. This may be accomplished, for example, 
by the polymerase chain reaction (PCR) technique which is described by Saiki et al. (1986), by Mullis, U.S. Patent 
No. 4,683,195, and by Mullis et al. U.S. Patent No. 4,683,202. The amplified sequence(s) may then be detected using 
a hybridization assay which is described in EP 317,077, published May 24, 1989. These hybridization assays, which 
should detect sequences at the level of 10^6/ml, utilize nucleic acid multimers which bind to single-stranded analyte 
nucleic acid, and which also bind to a multiplicity of single-stranded labeled oligonucleotides. A suitable solution phase 
sandwich assay which may be used with labeled polynucleotide probes, and the methods for the preparation of probes, 
are described in EPO 225,807, published June 16, 1987. 
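
The amplification figure above can be related to a number of PCR cycles with a short calculation. The sketch below assumes the idealized model in which each cycle multiplies the template count by (1 + efficiency), with efficiency 1.0 meaning perfect doubling; the model and the function name are assumptions for illustration and are not part of the cited methods.

```python
def cycles_needed(fold, efficiency=1.0):
    """Smallest number of cycles giving at least `fold` amplification.

    Assumes each cycle multiplies the template count by (1 + efficiency).
    """
    if fold < 1 or not 0.0 < efficiency <= 1.0:
        raise ValueError("fold must be >= 1 and 0 < efficiency <= 1")
    cycles, amplification = 0, 1.0
    while amplification < fold:
        amplification *= 1.0 + efficiency
        cycles += 1
    return cycles


print(cycles_needed(10_000))                   # perfect doubling
print(cycles_needed(10_000, efficiency=0.8))   # less efficient cycles

# Starting from about 100 target copies per ml, 14 perfect cycles give
# 100 * 2**14 copies per ml.
print(100 * 2 ** 14)
```

With perfect doubling, 13 cycles give 8,192-fold and 14 cycles give 16,384-fold amplification, so roughly 14 cycles suffice for the approximately 10,000-fold amplification contemplated; lower per-cycle efficiency requires a few more.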

[0141] The probes can be packaged into diagnostic kits. Diagnostic kits include the probe DNA, which may be labeled; 
alternatively, the probe DNA may be unlabeled and the ingredients for labeling may be included in the kit in separate 
containers. The kit may also contain other suitably packaged reagents and materials needed for the particular hybrid- 
ization protocol, for example, standards, as well as instructions for conducting the test. 

Immunoassay and Diagnostic Kits 

[0142] Both the polypeptides which react immunologically with serum containing HCV antibodies, for example, those 
detected by the antigenic screening method described infra in the Examples, as well as those derived from or encoded 
within the isolated clones described in the Examples, and composites thereof, and the antibodies raised against the 
HCV specific epitopes in these polypeptides, are useful in immunoassays to detect presence of HCV antibodies, or 
the presence of the virus and/or viral antigens, in biological samples. Design of the immunoassays is subject to a great 
deal of variation, and a variety of these are known in the art. For example, the immunoassay may utilize one viral 
epitope; alternatively, the immunoassay may use a combination of viral epitopes derived from these sources; these 
epitopes may be derived from the same or from different viral polypeptides, and may be in separate recombinant or 
natural polypeptides, or together in the same recombinant polypeptides. It may use, for example, a monoclonal antibody 
directed towards a viral epitope(s), a combination of monoclonal antibodies directed towards epitopes of one viral 
antigen, monoclonal antibodies directed towards epitopes of different viral antigens, polyclonal antibodies directed 
towards the same viral antigen, or polyclonal antibodies directed towards different viral antigens. Protocols may be 
based, for example, upon competition, or direct reaction, or sandwich type assays. Protocols may also, for example, 
use solid supports, or may be by immunoprecipitation. Most assays involve the use of labeled antibody or polypeptide; 
the labels may be, for example, fluorescent, chemiluminescent, radioactive, or dye molecules. Assays which amplify 
the signals from the probe are also known; examples are assays which utilize biotin and avidin, and 
enzyme-labeled and mediated immunoassays, such as ELISA assays. 

[0143] Some of the antigenic regions of the putative polyprotein have been mapped and identified by screening the 
antigenicity of bacterial expression products of HCV cDNAs which encode portions of the polyprotein. See the 
Examples. Other antigenic regions of HCV may be detected by expressing the portions of the HCV cDNAs in other expression 
systems, including yeast systems and cellular systems derived from insects and vertebrates. In addition, studies giving 
rise to an antigenicity index and hydrophobicity/hydrophilicity profile give rise to information concerning the probability 
of a region's antigenicity. 

[0144] The studies on antigenic mapping by expression of HCV cDNAs showed that a number of clones containing 
these cDNAs expressed polypeptides which were immunologically reactive with serum from individuals with NANBH.
No single polypeptide was immunologically reactive with all sera. Five of these polypeptides were very immunogenic
in that antibodies to the HCV epitopes in these polypeptides were detected in many different patient sera, although
the overlap in detection was not complete. Thus, the results on the immunogenicity of the polypeptides encoded in the
various clones suggest that efficient detection systems may include the use of panels of epitopes. The epitopes in the
panel may be constructed into one or multiple polypeptides.
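
By way of illustration only, the selection of a panel of epitopes may be viewed as a coverage problem: since no single polypeptide reacts with all sera, polypeptides are added until every reactive serum is detected by at least one member of the panel. The following Python sketch assumes a hypothetical reactivity table; the epitope names, serum identifiers, and the greedy selection rule are assumptions made for the example and are not data from the Examples.

```python
def greedy_panel(reactivity):
    """Pick epitopes until every serum detected by any epitope is covered.

    reactivity: dict mapping an epitope name to the set of serum IDs
    with which it reacts (hypothetical data, for illustration only).
    Returns (panel, covered_sera).
    """
    all_sera = set().union(*reactivity.values())
    covered, panel = set(), []
    while covered != all_sera:
        # Choose the epitope that detects the most sera not yet covered;
        # sorting the names makes tie-breaking deterministic.
        best = max(sorted(reactivity), key=lambda e: len(reactivity[e] - covered))
        panel.append(best)
        covered |= reactivity[best]
    return panel, covered


# Hypothetical reactivity table: no single epitope detects all six sera.
EXAMPLE_REACTIVITY = {
    "epitope_A": {1, 2, 3, 4},
    "epitope_B": {3, 4, 5},
    "epitope_C": {5, 6},
    "epitope_D": {1, 6},
}

panel, covered = greedy_panel(EXAMPLE_REACTIVITY)
```

A greedy choice is not guaranteed to give the smallest possible panel; it is used here only because it is short and easy to follow.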

[0145] Kits suitable for immunodiagnosis and containing the appropriate labeled reagents are constructed by packaging
the appropriate materials, including the polypeptides of the invention containing HCV epitopes or antibodies
directed against HCV epitopes in suitable containers, along with the remaining reagents and materials required for the
conduct of the assay, as well as a suitable set of assay instructions.

Further Characterization of the HCV Genome, Virions, and Viral Antigens Using Probes Derived From cDNA to the
Viral Genome 

[0146] The HCV cDNA sequence information in the newly isolated clones described in the Examples may be used 
to gain further information on the sequence of the HCV genome, and for identification and isolation of the HCV agent, 
and thus will aid in its characterization including the nature of the genome, the structure of the viral particle, and the
nature of the antigens of which it is composed. This information, in turn, can lead to additional polynucleotide probes, 
polypeptides derived from the HCV genome, and antibodies directed against HCV epitopes which would be useful for 
the diagnosis and/or treatment of HCV caused NANBH. 

[0147] The cDNA sequence information in the above-mentioned clones is useful for the design of probes for the
isolation of additional cDNA sequences which are derived from as yet undefined regions of the HCV genome(s) from
which the cDNAs in clones described herein and in EP 0,316,218 are derived. For example, labeled probes containing
a sequence of approximately 8 or more nucleotides, and preferably 20 or more nucleotides, which are derived from
regions close to the 5'-termini or 3'-termini of the composite HCV cDNA sequence shown in Fig. 17 may be used to
isolate overlapping cDNA sequences from HCV cDNA libraries. Alternatively, characterization of the genomic segments
could be from the viral genome(s) isolated from purified HCV particles. Methods for purifying HCV particles and for
detecting them during the purification procedure are described herein, infra. Procedures for isolating polynucleotide
genomes from viral particles are known in the art, and one procedure which may be used is that described in EP
0,218,316. The isolated genomic segments could then be cloned and sequenced. An example of this technique, which
utilizes amplification of the sequences to be cloned, is provided infra, and yielded clone 16jh.

[0148] Methods for constructing cDNA libraries are known in the art, and are discussed supra and infra; a method
for the construction of HCV cDNA libraries in lambda-gt11 is discussed in EPO Pub. No. 318,216. However, cDNA
libraries which are useful for screening with nucleic acid probes may also be constructed in other vectors known in the
art, for example, lambda-gt10 (Huynh et al. (1985)).

Screening for Anti-Viral Agents for HCV

[0149] The availability of cell culture and animal model systems for HCV makes it possible to screen for anti-viral 
agents which inhibit HCV replication, and particularly for those agents which preferentially allow cell growth and
multiplication while inhibiting viral replication. These screening methods are known by those of skill in the art. Generally,
the anti-viral agents are tested at a variety of concentrations, for their effect on preventing viral replication in cell culture
systems which support viral replication, and then for an inhibition of infectivity or of viral pathogenicity (and a low level
of toxicity) in an animal model system.

[0150] The methods and compositions provided herein for detecting HCV antigens and HCV polynucleotides are
useful for screening of anti-viral agents in that they provide an alternative, and perhaps more sensitive, means for
detecting the agent's effect on viral replication than the cell plaque assay or ID50 assay. For example, the
HCV-polynucleotide probes described herein may be used to quantitate the amount of viral nucleic acid produced in a cell
culture. This could be accomplished, for example, by hybridization or competition hybridization of the infected cell
nucleic acids with a labeled HCV-polynucleotide probe. Also, for example, anti-HCV antibodies may be used to identify
and quantitate HCV antigen(s) in the cell culture utilizing the immunoassays described herein. In addition, since it may
be desirable to quantitate HCV antigens in the infected cell culture by a competition assay, the polypeptides encoded
within the HCV cDNAs described herein are useful in these competition assays. Generally, a recombinant HCV
polypeptide derived from the HCV cDNA would be labeled, and the inhibition of binding of this labeled polypeptide to an
HCV polypeptide due to the antigen produced in the cell culture system would be monitored. Moreover, these techniques
are particularly useful in cases where the HCV may be able to replicate in a cell line without causing cell death.
[0151] The anti-viral agents which may be tested for efficacy by these methods are known in the art, and include, 
for example, those which interact with virion components and/or cellular components which are necessary for the 
binding and/or replication of the virus. Typical anti-viral agents may include, for example, inhibitors of virion polymerase
and/or protease(s) necessary for cleavage of the precursor polypeptides. Other anti-viral agents may include those
which act with nucleic acids to prevent viral replication, for example, anti-sense polynucleotides, etc.
[0152] Antisense polynucleotide molecules comprise a complementary nucleotide sequence which allows
them to hybridize specifically to designated regions of genomes or RNAs. Antisense polynucleotides may include, for
example, molecules that will block protein translation by binding to mRNA, or may be molecules which prevent
replication of viral RNA by transcriptase. They may also include molecules which carry agents (non-covalently attached or
covalently bound) which cause the viral RNA to be inactive by causing, for example, scissions in the viral RNA. They
may also bind to cellular polynucleotides which enhance and/or are required for viral infectivity, replicative ability, or
chronicity. Antisense molecules which are to hybridize to HCV derived RNAs may be designed based upon the
sequence information of the HCV cDNAs provided herein. The antiviral agents based upon anti-sense polynucleotides
for HCV may be designed to bind with high specificity, to be of increased solubility, to be stable, and to have low toxicity.
Hence, they may be delivered in specialized systems, for example, liposomes, or by gene therapy. In addition, they
may include analogs, attached proteins, substituted or altered bonding between bases, etc.
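
As a minimal sketch of the complementarity described above, the following Python fragment derives a DNA oligonucleotide that is the reverse complement of a stretch of sense-strand RNA. The function names, the example target, and the choice of an unmodified DNA oligonucleotide are assumptions made for illustration; the minimum length of 8 nucleotides mirrors the shortest contiguous complementary sequence recited elsewhere in this document, and the sketch says nothing about specificity, stability, or delivery.

```python
# Map each base of the RNA target to the complementary DNA base.
_COMPLEMENT = str.maketrans("ACGU", "TGCA")

MIN_LENGTH = 8  # shortest contiguous complementary run considered here


def antisense_dna(sense_rna: str) -> str:
    """Return the DNA oligonucleotide (5'->3') complementary to sense_rna (5'->3')."""
    target = sense_rna.upper()
    if set(target) - set("ACGU"):
        raise ValueError("target must contain only A, C, G and U")
    if len(target) < MIN_LENGTH:
        raise ValueError(f"target must be at least {MIN_LENGTH} nucleotides")
    # Complement every base, then reverse so the result reads 5'->3'.
    return target.translate(_COMPLEMENT)[::-1]


def is_antisense_to(oligo_dna: str, sense_rna: str) -> bool:
    """True if oligo_dna is exactly the reverse complement of sense_rna."""
    return antisense_dna(sense_rna) == oligo_dna.upper()


# Hypothetical 10-nucleotide target, not taken from any figure.
example_oligo = antisense_dna("AUGGCCAUUG")
```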

[0153] Other types of drugs may be based upon polynucleotides which "mimic" important control regions of the HCV 
genome, and which may be therapeutic due to their interactions with key components of the system responsible for 
viral infectivity or replication.

General Methods 

[0154] The general techniques used in extracting the genome from a virus, preparing and probing a cDNA library,
sequencing clones, constructing expression vectors, transforming cells, performing immunological assays such as
radioimmunoassays and ELISA assays, growing cells in culture, and the like are known in the art, and laboratory
manuals are available describing these techniques. However, as a general guide, the following sets forth some sources
currently available for such procedures, and for materials useful in carrying them out.

[0155] Both prokaryotic and eukaryotic host cells may be used for expression of desired coding sequences when
appropriate control sequences which are compatible with the designated host are used. Among prokaryotic hosts, E.
coli is most frequently used. Expression control sequences for prokaryotes include promoters, optionally containing
operator portions, and ribosome binding sites. Transfer vectors compatible with prokaryotic hosts are commonly derived
from, for example, pBR322, a plasmid containing operons conferring ampicillin and tetracycline resistance, and the
various pUC vectors, which also contain sequences conferring antibiotic resistance markers. These markers may be
used to obtain successful transformants by selection. Commonly used prokaryotic control sequences include the
Beta-lactamase (penicillinase) and lactose promoter systems (Chang et al. (1977)), the tryptophan (trp) promoter system
(Goeddel et al. (1980)), the lambda-derived PL promoter and N gene ribosome binding site (Shimatake et al. (1981)),
and the hybrid tac promoter (De Boer et al. (1983)) derived from sequences of the trp and lac UV5 promoters. The
foregoing systems are particularly compatible with E. coli; if desired, other prokaryotic hosts such as strains of Bacillus
or Pseudomonas may be used, with corresponding control sequences.

[0156] Eukaryotic hosts include yeast and mammalian cells in culture systems. Saccharomyces cerevisiae and
Saccharomyces carlsbergensis are the most commonly used yeast hosts, and are convenient fungal hosts. Yeast
compatible vectors carry markers which permit selection of successful transformants by conferring prototrophy to
auxotrophic mutants or resistance to heavy metals on wild-type strains. Yeast compatible vectors may employ the 2 micron
origin of replication (Broach et al. (1983)), the combination of CEN3 and ARS1 or other means for assuring replication,
such as sequences which will result in incorporation of an appropriate fragment into the host cell genome. Control
sequences for yeast vectors are known in the art and include promoters for the synthesis of glycolytic enzymes (Hess
et al. (1968); Holland et al. (1978)), including the promoter for 3-phosphoglycerate kinase (Hitzeman (1980)).
Terminators may also be included, such as those derived from the enolase gene (Holland (1981)). Particularly useful control
systems are those which comprise the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) promoter or alcohol
dehydrogenase (ADH) regulatable promoter, terminators also derived from GAPDH, and if secretion is desired, leader
sequence from yeast alpha factor. In addition, the transcriptional regulatory region and the transcriptional initiation
region which are operably linked may be such that they are not naturally associated in the wild-type organism. These
systems are described in detail in EPO 120,551, published October 3, 1984; EPO 116,201, published August 22, 1984;
and EPO 164,556, published December 18, 1985, all of which are assigned to the herein assignee, and are hereby
incorporated herein by reference.

[0157] Mammalian cell lines available as hosts for expression are known in the art and include many immortalized 
cell lines available from the American Type Culture Collection (ATCC), including HeLa cells, Chinese hamster ovary 
(CHO) cells, baby hamster kidney (BHK) cells, and a number of other cell lines. Suitable promoters for mammalian
cells are also known in the art and include viral promoters such as that from Simian Virus 40 (SV40) (Fiers (1978)),
Rous sarcoma virus (RSV), adenovirus (ADV), and bovine papilloma virus (BPV). Mammalian cells may also require
terminator sequences and poly A addition sequences; enhancer sequences which increase expression may also be
included, and sequences which cause amplification of the gene may also be desirable. These sequences are known
in the art. Vectors suitable for replication in mammalian cells may include viral replicons, or sequences which ensure
integration of the appropriate sequences encoding NANBV epitopes into the host genome.

[0158] Transformation may be by any known method for introducing polynucleotides into a host cell, including, for
example, packaging the polynucleotide in a virus and transducing a host cell with the virus, and by direct uptake of the
polynucleotide. The transformation procedure used depends upon the host to be transformed. For example,
transformation of the E. coli host cells with lambda-gt11 containing BB-NANBV sequences is discussed in the Example section,
infra. Bacterial transformation by direct uptake generally employs treatment with calcium or rubidium chloride (Cohen
(1972); Maniatis (1982)). Yeast transformation by direct uptake may be carried out using the method of Hinnen et al.
(1978). Mammalian transformations by direct uptake may be conducted using the calcium phosphate precipitation
method of Graham and Van der Eb (1978), or the various known modifications thereof.

[0159] Vector construction employs techniques which are known in the art. Site-specific DNA cleavage is performed 
by treating with suitable restriction enzymes under conditions which generally are specified by the manufacturer of 
these commercially available enzymes. In general, about 1 microgram of plasmid or DNA sequence is cleaved by 1
unit of enzyme in about 20 microliters buffer solution by incubation of 1-2 hr at 37°C. After incubation with the restriction
enzyme, protein is removed by phenol/chloroform extraction and the DNA recovered by precipitation with ethanol. The
cleaved fragments may be separated using polyacrylamide or agarose gel electrophoresis techniques, according to
the general procedures found in Methods in Enzymology (1980) 65:499-560.

[0160] Sticky ended cleavage fragments may be blunt ended using E. coli DNA polymerase I (Klenow) in the presence
of the appropriate deoxynucleotide triphosphates (dNTPs). Treatment with S1 nuclease may
also be used, resulting in the hydrolysis of any single stranded DNA portions.

[0161] Ligations are carried out using standard buffer and temperature conditions using T4 DNA ligase and ATP; 
sticky end ligations require less ATP and less ligase than blunt end ligations. When vector fragments are used as part 
of a ligation mixture, the vector fragment is often treated with bacterial alkaline phosphatase (BAP) or calf intestinal 
alkaline phosphatase to remove the 5'-phosphate and thus prevent religation of the vector; alternatively, restriction
enzyme digestion of unwanted fragments can be used to prevent ligation.

[0162] Ligation mixtures are transformed into suitable cloning hosts, such as E. coli, and successful transformants 
selected by, for example, antibiotic resistance, and screened for the correct construction. 

[0163] Synthetic oligonucleotides may be prepared using an automated oligonucleotide synthesizer as described by
Warner (1984). If desired, the synthetic strands may be labeled with 32P by treatment with polynucleotide kinase in the
presence of 32P-ATP, using standard conditions for the reaction.

[0164] DNA sequences, including those isolated from cDNA libraries, may be modified by known techniques, including,
for example, site directed mutagenesis, as described by Zoller (1982). Briefly, the DNA to be modified is packaged
into phage as a single stranded sequence, and converted to a double stranded DNA with DNA polymerase using, as
a primer, a synthetic oligonucleotide complementary to the portion of the DNA to be modified, and having the desired
modification included in its own sequence. The resulting double stranded DNA is transformed into a phage supporting
host bacterium. Cultures of the transformed bacteria, which contain replications of each strand of the phage, are plated
in agar to obtain plaques. Theoretically, 50% of the new plaques contain phage having the mutated sequence, and the
remaining 50% have the original sequence. Replicates of the plaques are hybridized to labeled synthetic probe at
temperatures and conditions which permit hybridization with the correct strand, but not with the unmodified sequence.
The sequences which have been identified by hybridization are recovered and cloned.

[0165] DNA libraries may be probed using the procedure of Grunstein and Hogness (1975). Briefly, in this procedure,
the DNA to be probed is immobilized on nitrocellulose filters, denatured, and prehybridized with a buffer containing
0-50% formamide, 0.75 M NaCl, 75 mM Na citrate, 0.02% (wt/v) each of bovine serum albumin, polyvinyl pyrrolidone,
and Ficoll, 50 mM Na Phosphate (pH 6.5), 0.1% SDS, and 100 micrograms/ml carrier denatured DNA. The percentage
of formamide in the buffer, as well as the time and temperature conditions of the prehybridization and subsequent
hybridization steps, depends on the stringency required. Oligomeric probes which require lower stringency conditions
are generally used with low percentages of formamide, lower temperatures, and longer hybridization times. Probes
containing more than 30 or 40 nucleotides such as those derived from cDNA or genomic sequences generally employ
higher temperatures, e.g., about 40-42°C, and a high percentage, e.g., 50%, formamide. Following prehybridization,
5'-32P-labeled oligonucleotide probe is added to the buffer, and the filters are incubated in this mixture under
hybridization conditions. After washing, the treated filters are subjected to autoradiography to show the location of the
hybridized probe; DNA in corresponding locations on the original agar plates is used as the source of the desired DNA.
[0166] For routine vector constructions, ligation mixtures are transformed into E. coli strain HB101 or other suitable
host, and successful transformants selected by antibiotic resistance or other markers. Plasmids from the transformants
are then prepared according to the method of Clewell et al. (1969), usually following chloramphenicol amplification
(Clewell (1972)). The DNA is isolated and analyzed, usually by restriction enzyme analysis and/or sequencing.
Sequencing may be by the dideoxy method of Sanger et al. (1977) as further described by Messing et al. (1981), or by
the method of Maxam et al. (1980). Problems with band compression, which are sometimes observed in GC rich
regions, were overcome by use of 7-deazaguanosine according to Barr et al. (1986).

[0167] The enzyme-linked immunosorbent assay (ELISA) can be used to measure either antigen or antibody
concentrations. This method depends upon conjugation of an enzyme to either an antigen or an antibody, and uses the
bound enzyme activity as a quantitative label. To measure antibody, the known antigen is fixed to a solid phase (e.g.,
to a microplate or plastic cup), incubated with test serum dilutions, washed, incubated with anti-immunoglobulin labeled
with an enzyme, and washed again. Enzymes suitable for labeling are known in the art, and include, for example,
horseradish peroxidase. Enzyme activity bound to the solid phase is measured by adding the specific substrate, and
determining product formation or substrate utilization colorimetrically. The enzyme activity bound is a direct function
of the amount of antibody bound.

[0168] To measure antigen, a known specific antibody is fixed to the solid phase, the test material containing antigen
is added, and after an incubation the solid phase is washed and a second enzyme-labeled antibody is added. After washing,
substrate is added, and enzyme activity is estimated colorimetrically, and related to antigen concentration.
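
The step of relating the colorimetric signal to antigen concentration may be sketched as follows. This Python fragment assumes that a series of standards of known concentration has been run alongside the test material and simply interpolates linearly between neighboring standards. The standard values, the units, and the use of piecewise-linear interpolation are assumptions made for the example; in practice a fitted curve may be preferred.

```python
from bisect import bisect_left


def interpolate_concentration(absorbance, standards):
    """Estimate antigen concentration from an absorbance reading.

    standards: iterable of (concentration, absorbance) pairs whose
    absorbance rises with concentration (hypothetical values here).
    """
    ordered = sorted(standards, key=lambda pair: pair[1])
    signals = [a for _, a in ordered]
    if not signals[0] <= absorbance <= signals[-1]:
        raise ValueError("absorbance lies outside the standard curve")
    i = bisect_left(signals, absorbance)
    if signals[i] == absorbance:
        return ordered[i][0]
    # Linear interpolation between the two bracketing standards.
    (c0, a0), (c1, a1) = ordered[i - 1], ordered[i]
    return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


# Hypothetical standards: (concentration in ng/ml, absorbance).
EXAMPLE_STANDARDS = [(0.0, 0.05), (10.0, 0.25), (50.0, 0.85), (100.0, 1.45)]

estimate = interpolate_concentration(0.55, EXAMPLE_STANDARDS)
```

Readings outside the range of the standards are rejected rather than extrapolated, since the relation between signal and concentration is only known within that range.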

Examples 


[0169] Described below are examples of the present invention which are provided only for illustrative purposes, and 
not to limit the scope of the present invention. In light of the present disclosure, numerous embodiments within the 
scope of the claims will be apparent to those of ordinary skill in the art. 

Isolation and Sequence of Overlapping HCV cDNA Clones 13i, 26j, CA59a, CA84a, CA156e and CA167b

[0170] The clones 13i, 26j, CA59a, CA84a, CA156e and CA167b were isolated from the lambda-gt11 library which
contains HCV cDNA (ATCC No. 40394), the preparation of which is described in EPO Pub. No. 318,216 (published 31
May 1989), and WO 89/04669 (published 1 June 1989). Screening of the library was with the probes described infra,
using the method described in Huynh (1985). The frequency with which positive clones appeared with the respective
probes was about 1 in 50,000.
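
To give a sense of what a frequency of about 1 in 50,000 implies for the scale of a screen, the following Python sketch estimates how many plaques would need to be examined to find at least one positive clone with a chosen probability. It assumes that each plaque is an independent trial with the same chance of being positive; that assumption, the function names, and the 95% target are illustrative and are not taken from the screening procedure itself.

```python
import math

POSITIVE_FREQUENCY = 1 / 50_000  # approximate frequency stated in the text


def probability_of_at_least_one(n_plaques, frequency=POSITIVE_FREQUENCY):
    """Chance that a screen of n_plaques contains at least one positive."""
    return 1.0 - (1.0 - frequency) ** n_plaques


def plaques_needed(target_probability, frequency=POSITIVE_FREQUENCY):
    """Smallest number of plaques giving at least target_probability."""
    if not 0.0 < target_probability < 1.0:
        raise ValueError("target_probability must be between 0 and 1")
    # Solve 1 - (1 - f)^n >= p for n.
    return math.ceil(math.log(1.0 - target_probability) / math.log(1.0 - frequency))


n_for_95_percent = plaques_needed(0.95)
```

Under these assumptions the answer is on the order of three times the reciprocal of the frequency, which is the usual rule of thumb for a 95% chance of at least one hit.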

[0171] The isolation of clone 13i was accomplished using a synthetic probe derived from the sequence of clone 12f.
The sequence of the probe was:

5' GAA CGT TGC GAT CTG GAA GAC AGG GAC AGG 3'.

[0172] The isolation of clone 26j was accomplished using a probe derived from the 5'-region of clone K9-1. The
sequence of the probe was:

5' TAT CAG TTA TGC CAA CGG AAG CGG CCC CGA 3'.

[0173] The isolation procedures for clone 12f and for clone k9-1 (also called K9-1) are described in EPO Pub. No.
318,216, and their sequences are shown in Figs. 1 and 2, respectively. The HCV cDNA sequences of clones 13i and
26j are shown in Figs. 4 and 5, respectively. Also shown are the amino acids encoded therein, as well as the overlap
of clone 13i with clone 12f, and the overlap of clone 26j with clone 13i. The sequences for these clones confirmed the
sequence of clone K9-1. Clone K9-1 had been isolated from a different HCV cDNA library (See EP 0,218,316).

[0174] Clone CA59a was isolated utilizing a probe based upon the sequence of the 5'-region of clone 26j. The
sequence of this probe was:

5' CTG GTT AGC AGG GCT TTT CTA TCA CCA CAA 3'.

[0175] A probe derived from the sequence of clone CA59a was used to isolate clone CA84a. The sequence of the
probe used for this isolation was:

5' AAG GTC CTG GTA GTG CTG CTG CTA TTT GCC 3'.

[0176] Clone CA156e was isolated using a probe derived from the sequence of clone CA84a. The sequence of the 
probe was: 

5' ACT GGA CGA CGC AAG GTT GCA ATT GCT CTA 3'.

[0177] Clone CA167b was isolated using a probe derived from the sequence of clone CA156e. The sequence of
the probe was:

5' TTC GAC GTC ACA TCG ATC TGC TTG TCG GGA 3'.

[0178] The nucleotide sequences of the HCV cDNAs in clones CA59a, CA84a, CA156e, and CA167b are shown in
Figs. 6, 7, 8, and 9, respectively. The amino acids encoded therein, as well as the overlap with the sequences of
relevant clones, are also shown in the Figs.
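
As a simple consistency check on the six synthetic probes recited above, the following Python sketch strips the spacing used in the printed sequences, confirms that each probe contains only A, C, G and T, and reports its length and G+C content. The probe sequences are copied from the text; the dictionary keys (the clone each probe was used to isolate), the function names, and the choice of G+C fraction as a summary statistic are assumptions made for the example.

```python
# Probe sequences as printed in the text, keyed by the clone each one isolated.
PROBES = {
    "13i": "GAA CGT TGC GAT CTG GAA GAC AGG GAC AGG",
    "26j": "TAT CAG TTA TGC CAA CGG AAG CGG CCC CGA",
    "CA59a": "CTG GTT AGC AGG GCT TTT CTA TCA CCA CAA",
    "CA84a": "AAG GTC CTG GTA GTG CTG CTG CTA TTT GCC",
    "CA156e": "ACT GGA CGA CGC AAG GTT GCA ATT GCT CTA",
    "CA167b": "TTC GAC GTC ACA TCG ATC TGC TTG TCG GGA",
}


def normalize(sequence: str) -> str:
    """Remove whitespace, upper-case, and reject non-DNA characters."""
    cleaned = "".join(sequence.split()).upper()
    if set(cleaned) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    return cleaned


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in the normalized sequence."""
    cleaned = normalize(sequence)
    return (cleaned.count("G") + cleaned.count("C")) / len(cleaned)


probe_lengths = {clone: len(normalize(seq)) for clone, seq in PROBES.items()}
probe_gc = {clone: gc_fraction(seq) for clone, seq in PROBES.items()}
```

Each of the six probes is a 30-mer, which is consistent with the use of oligomeric probes of about 30 nucleotides in the hybridization procedure described under General Methods.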

Creation of "pi" HCV cDNA Library 

[0179] A library of HCV cDNA, the "pi" library, was constructed from the same batch of infectious chimpanzee plasma
used to construct the lambda-gt11 HCV cDNA library (ATCC No. 40394) described in EPO Pub. No. 318,216, and
utilizing essentially the same techniques. However, construction of the pi library utilized a primer-extension method,
in which the primer for reverse transcriptase was based on the sequence of clone CA59a. The sequence of the primer
was:

5' GGT GAC GTG GGT TTC 3'.


Isolation and Sequence of Clone pi14a

[0180] Screening of the "pi" HCV cDNA library described supra, with the probe used to isolate clone CA167b (See
supra), yielded clone pi14a. The clone contains about 800 base pairs of cDNA which overlaps clones CA167b, CA156e,
CA84a and CA59a, which were isolated from the lambda-gt11 HCV cDNA library (ATCC No. 40394). In addition, pi14a
also contains about 250 base pairs of DNA which are upstream of the HCV cDNA in clone CA167b.

Isolation and Sequence of Clones CA216a, CA290a and ag30a

[0181] Based on the sequence of clone CA167b a synthetic probe was made having the following sequence:

5' GGC TTT ACC ACG TCA CCA ATG ATT GCC CTA 3'

The above probe was used to screen the lambda-gt11 library (ATCC No. 40394) which yielded clone CA216a, whose
HCV sequences are shown in Fig. 10.

[0182] Another probe was made based on the sequence of clone CA216a having the following sequence:

5' TTT GGG TAA GGT CAT CGA TAC CCT TAC GTG 3'

Screening the lambda-gt11 library (ATCC No. 40394) with this probe yielded clone CA290a, the HCV sequences therein
being shown in Fig. 11.

[0183] In a parallel approach, a primer-extension cDNA library was made using nucleic acid extracted from the same
infectious plasma used in the original lambda-gt11 cDNA library described above. The primer used was based on the
sequence of clones CA216a and CA290a:

5' GAA GCC GCA CGT AAG 3'

The cDNA library was made using methods similar to those described previously for libraries used in the isolation of
clones pi14a and K9-1. The probe used to screen this library was based on the sequence of clone CA290a:

5' CCG GCG TAG GTC GCG CAA TTT GGG TAA 3'

Clone ag30a was isolated from the new library with the above probe, and contained about 670 base pairs of HCV
sequence. See Fig. 12. Part of this sequence overlaps the HCV sequence of clones CA216a and CA290a. About 300
base pairs of the ag30a sequence, however, are upstream of the sequence from clone CA290a. The non-overlapping
sequence shows a start codon (*) and stop codons that may indicate the start of the HCV ORF. Also indicated in Fig.
12 are putative small encoded peptides (#) which may play a role in regulating translation, as well as the putative first
amino acid of the putative polypeptide (/), and downstream amino acids encoded therein.
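
The way in which overlapping clones extend a known sequence may be sketched as follows. This Python fragment joins an upstream and a downstream sequence by finding the longest stretch at the 3'-end of the first that is identical to the 5'-end of the second. The two sequences are short toy strings invented for the example, not HCV sequences, and the requirement of an exact match of at least 8 nucleotides is an assumption; real clone sequences may contain differences that call for alignment rather than exact matching.

```python
def merge_overlapping(upstream: str, downstream: str, min_overlap: int = 8):
    """Join two sequences that share an exact end-to-start overlap.

    Returns (composite_sequence, overlap_length); raises ValueError when
    no exact overlap of at least min_overlap nucleotides exists.
    """
    longest_possible = min(len(upstream), len(downstream))
    # Try the longest candidate overlap first so the best match wins.
    for k in range(longest_possible, min_overlap - 1, -1):
        if upstream.endswith(downstream[:k]):
            return upstream + downstream[k:], k
    raise ValueError("no overlap of the required length")


# Toy sequences (hypothetical): the last 8 bases of the first match
# the first 8 bases of the second.
TOY_UPSTREAM = "GGATCCTTAGCAAGTCCGTA"
TOY_DOWNSTREAM = "AGTCCGTATTGACCA"

composite, overlap = merge_overlapping(TOY_UPSTREAM, TOY_DOWNSTREAM)
```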

Isolation and Sequence of Clone CA205a

[0184] Clone CA205a was isolated from the original lambda-gt11 library (ATCC No. 40394), using a synthetic probe
derived from the HCV sequence in clone CA290a (Fig. 11). The sequence of the probe was:

5' TCA GAT CGT TGG TGG AGT TTA CTT GTT GCC 3'.

The sequence of the HCV cDNA in CA205a, shown in Fig. 13, overlaps with the cDNA sequences in both clones ag30a
and CA290a. The overlap of the sequence with that of CA290a is shown by the dotted line above the sequence (the
figure also shows the putative amino acids encoded in this fragment).

[0185] As observed from the HCV cDNA sequences in clones CA205a and ag30a, the putative HCV polyprotein 
appears to begin at the ATG start codon; the HCV sequences in both clones contain an in-frame, contiguous double 
stop codon (TGATAG) forty-two nucleotides upstream from this ATG. The HCV ORF appears to begin after these stop
codons, and to extend for at least 8907 nucleotides (See the composite HCV cDNA shown in Fig. 17).
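
The reasoning that an in-frame stop codon upstream of an ATG marks the earliest possible start of the ORF may be illustrated by the following Python sketch, which walks upstream from a start codon in steps of three nucleotides and records each stop codon met in the same reading frame. The toy sequence is constructed for the example so that the double stop TGATAG begins 42 nucleotides upstream of the ATG; it is not the sequence of any clone, and the spacer between the stop codons and the ATG is arbitrary.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_in_frame_stops(sequence: str, atg_index: int):
    """List (distance_upstream, codon) for in-frame stops before the ATG.

    distance_upstream is counted from the first base of the stop codon
    to the A of the ATG at atg_index.
    """
    if sequence[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG")
    found = []
    # Step back one codon at a time, staying in the frame of the ATG.
    for pos in range(atg_index - 3, -1, -3):
        codon = sequence[pos:pos + 3]
        if codon in STOP_CODONS:
            found.append((atg_index - pos, codon))
    return found


# Toy construct: 2 leading bases, the double stop, a 36-base spacer
# free of stop codons in this frame, then the start codon.
TOY_SEQUENCE = "CC" + "TGATAG" + "GCT" * 12 + "ATGGCC"
TOY_ATG_INDEX = TOY_SEQUENCE.find("ATG")

stops = upstream_in_frame_stops(TOY_SEQUENCE, TOY_ATG_INDEX)
```

In the toy construct the scan reports the TAG and TGA codons at 39 and 42 nucleotides upstream, i.e. the two halves of the contiguous double stop, with no other in-frame stop between them and the ATG.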

Isolation and Sequence of Clone 18g

[0186] Based on the sequence of clone ag30a (See Fig. 12) and of an overlapping clone from the original
lambda-gt11 library (ATCC No. 40394), CA230a, a synthetic probe was made having the following sequence:

5' CCA TAG TGG TCT GCG GAA CCG GTG AGT ACA 3'.

Screening of the original lambda-gt11 HCV cDNA library with the probe yielded clone 18g, the HCV cDNA sequence
of which is shown in Fig. 14. Also shown in the figure are the overlap with clone ag30a, and putative polypeptides
encoded within the HCV cDNA.

[0187] The cDNA in clone 18g (C18g or 18g) overlaps that in clones ag30a and CA205a, described supra. The
sequence of C18g also contains the double stop codon region observed in clone ag30a. The polynucleotide region
upstream of these stop codons presumably represents part of the 5'-region of the HCV genome, which may contain
short ORFs, and which can be confirmed by direct sequencing of the purified HCV genome. These putative small
encoded peptides may play a regulatory role in translation. The region of the HCV genome upstream of that represented
by C18g can be isolated for sequence analysis using essentially the technique described in EPO Pub. No. 318,216
for isolating cDNA sequences upstream of the HCV cDNA sequence in clone 12f. Essentially, small synthetic
oligonucleotide primers of reverse transcriptase, which are based upon the sequence of C18g, are synthesized and used to
bind to the corresponding sequence in HCV genomic RNA. The primer sequences are proximal to the known 5'-terminus
of C18g, but sufficiently downstream to allow the design of probe sequences upstream of the primer sequences. Known
standard methods of priming and cloning are used. The resulting cDNA libraries are screened with sequences upstream
of the priming sites (as deduced from the elucidated sequence of C18g). The HCV genomic RNA is obtained from
either plasma or liver samples from individuals with NANBH. Since HCV appears to be a Flavi-like virus, the 5'-terminus
of the genome may be modified with a "cap" structure. It is known that Flavivirus genomes contain 5'-terminal "cap"
structures (Yellow Fever virus, Rice et al. (1988); Dengue virus, Hahn et al. (1988); Japanese Encephalitis Virus (1987)).

Isolation and Sequence of Clones from the beta-HCV cDNA library 

[0188] Clones containing cDNA representative of the 3'-terminal region of the HCV genome were isolated from a
cDNA library constructed from the original infectious chimpanzee plasma pool which was used for the creation of the
HCV cDNA lambda-gt11 library (ATCC No. 40394), described in EPO Pub. No. 318,216. In order to create the cDNA
library, RNA extracted from the plasma was "tailed" with poly rA using poly (rA) polymerase, and cDNA was synthesized
using oligo(dT)12-18 as a primer for reverse transcriptase. The resulting RNA:cDNA hybrid was digested with RNase
H, and converted to double stranded HCV cDNA. The resulting HCV cDNA was cloned into lambda-gt10, using
essentially the technique described in Huynh (1985), yielding the beta (or b) HCV cDNA library. The procedures used
were as follows.

[0189] An aliquot (12 ml) of the plasma was treated with proteinase K, and extracted with an equal volume of phenol
saturated with 0.05 M Tris-Cl, pH 7.5, 0.05% (v/v) beta-mercaptoethanol, 0.1% (w/v) hydroxyquinoline, 1 mM EDTA.
The resulting aqueous phase was re-extracted with the phenol mixture, followed by 3 extractions with a 1:1 mixture
containing phenol and chloroform:isoamyl alcohol (24:1), followed by 2 extractions with a mixture of chloroform and
isoamyl alcohol (1:1). Subsequent to adjustment of the aqueous phase to 200 mM with respect to NaCl, nucleic acids
in the aqueous phase were precipitated overnight at -20°C, with 2.5 volumes of cold absolute ethanol. The precipitates
were collected by centrifugation at 10,000 RPM for 40 min., washed with 70% ethanol containing 20 mM NaCl, and
with 100% cold ethanol, dried for 5 min. in a desiccator, and dissolved in water.
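
The salt adjustment and ethanol addition in the precipitation step reduce to a short dilution calculation, sketched below in Python. The sketch assumes that the aqueous phase contains no NaCl before adjustment, that a 5 M NaCl stock is used, and that the 2.5 volumes of ethanol are reckoned on the volume after salt addition; these assumptions, the 12 ml example volume, and the function name are illustrative and are not specified in the procedure.

```python
def precipitation_volumes(sample_ml, stock_nacl_molar=5.0,
                          final_nacl_molar=0.2, ethanol_volumes=2.5):
    """Return (ml of NaCl stock to add, ml of ethanol to add).

    Solves stock * v = final * (sample + v) for the stock volume v,
    assuming the sample contains no NaCl beforehand.
    """
    if stock_nacl_molar <= final_nacl_molar:
        raise ValueError("stock must be more concentrated than the target")
    nacl_ml = sample_ml * final_nacl_molar / (stock_nacl_molar - final_nacl_molar)
    ethanol_ml = ethanol_volumes * (sample_ml + nacl_ml)
    return nacl_ml, ethanol_ml


# Hypothetical example: a 12 ml aqueous phase adjusted to 200 mM NaCl.
nacl_to_add, ethanol_to_add = precipitation_volumes(12.0)
```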

[0190] The isolated nucleic acids from the infectious chimpanzee plasma pool were tailed with poly rA utilizing
poly-A polymerase in the presence of human placenta ribonuclease inhibitor (HPRI) (purchased from Amersham Corp.),
utilizing MS2 RNA as carrier. Isolated nucleic acids equivalent to that in 2 ml of plasma were incubated in a solution
containing TMN (50 mM Tris-HCl, pH 7.9, 10 mM MgCl2, 250 mM NaCl, 2.5 mM MnCl2, 2 mM dithiothreitol (DTT)), 40
micromolar [alpha-32P] ATP, 20 units HPRI (Amersham Corp.), and about 9 to 10 units of RNase free poly-A polymerase
(BRL). Incubation was for 10 min. at 37°C, and the reactions were stopped with EDTA (final concentration about 250
mM). The solution was extracted with an equal volume of phenol-chloroform, and with an equal volume of chloroform,
and nucleic acids were precipitated overnight at -20°C with 2.5 volumes of ethanol in the presence of 200 mM NaCl.

Isolation of Clone b5a 

[0191] The beta HCV cDNA library was screened by hybridization using a synthetic probe, which had a sequence
based upon the HCV cDNA sequence in clone 15e. The isolation of clone 15e is described in EPO Pub. No. 318,216,
and its sequence is shown in Fig. 3. The sequence of the synthetic probe was:

5' ATT GCG AGA TCT ACG GGG CCT GCT ACT CCA 3'.

Screening of the library yielded clone beta-5a (b5a), which contains an HCV cDNA region of approximately 1000 base
pairs. The 5'-region of this cDNA overlaps clones 35f, 19g, 26g, and 15e (these clones are described supra). The
region between the 3'-terminal poly-A sequence and the 3'-sequence which overlaps clone 15e contains approximately
200 base pairs. This clone allows the identification of a region of the 3'-terminal sequence of the HCV genome.

[0192] The sequence of b5a is contained within the sequence of the HCV cDNA in clone 16jh (described infra).
Moreover, the sequence is also present in CC34a, isolated from the original lambda-gt11 library (ATCC No. 40394).
(The original lambda-gt11 library is referred to herein as the "C" library).

Isolation and Sequence of Clones Generated by PCR Amplification of the 3'-Region of the HCV Genome 

[0193] Multiple cDNA clones have been generated which contain nucleotide sequences derived from the 3'-region
of the HCV genome. This was accomplished by amplifying a targeted region of the genome by a polymerase chain
reaction technique described in Saiki et al. (1986), and in Saiki et al. (1988), which was modified as described below.
The HCV RNA which was amplified was obtained from the original infectious chimpanzee plasma pool which was used
for the creation of the HCV cDNA lambda-gt11 library (ATCC No. 40394) described in EPO Pub. No. 318,216. Isolation
of the HCV RNA was as described supra. The isolated RNA was tailed at the 3'-end with ATP by E. coli poly-A polymerase
as described in Sippel (1973), except that the nucleic acids isolated from chimp serum were substituted for the
nucleic acid substrate. The tailed RNA was then reverse transcribed into cDNA by reverse transcriptase, using an oligo
dT-primer adapter, essentially as described by Han (1987), except that the components and sequence of the
primer-adapter were:


Stuffer   NotI       SP6 Promoter                  Primer

AATTC     GCGGCCGC   CATACGATTTAGGTGACACTATAGAA    T15


The resultant cDNA was subjected to amplification by PCR using two primers: 


Primer          Sequence

JH32 (30mer)    ATAGCGGCCGCCCTCGATTGCGAGATCTAC
JH11 (20mer)    AATTCGGGCGGCCGCCATACGA

The JH32 primer contained a sequence of 20 nucleotides hybridizable to the 5'-end of the target region in the cDNA, with
an estimated Tm of 66°C. The JH11 primer was derived from a portion of the oligo dT-primer adapter; thus, it is specific to the
3'-end of the cDNA, with a Tm of 64°C. Both primers were designed to have a recognition site for the restriction enzyme
NotI at the 5'-end, for use in subsequent cloning of the amplified HCV cDNA.
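
The presence of the NotI recognition site in the oligonucleotides listed above can be checked with a short script. The following is a minimal sketch in Python, not part of the original procedure; the function name `find_sites` is hypothetical, and only the primer-adapter and JH32 sequences are used, transcribed from the text with spaces removed.

```python
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence

# Sequences transcribed from the text (spaces removed).
# The primer-adapter ends in a run of 15 T residues.
PRIMER_ADAPTER = "AATTC" + "GCGGCCGC" + "CATACGATTTAGGTGACACTATAGAA" + "T" * 15
JH32 = "ATAGCGGCCGCCCTCGATTGCGAGATCTAC"


def find_sites(seq: str, site: str) -> list[int]:
    """Return every 0-based start position of `site` within `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


if __name__ == "__main__":
    for name, seq in (("primer-adapter", PRIMER_ADAPTER), ("JH32", JH32)):
        # Print the name, the length, and the NotI site positions.
        print(name, len(seq), find_sites(seq, NOTI_SITE))
```

In both sequences the site lies within the first dozen nucleotides, consistent with the statement that the recognition site is at the 5'-end.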

[0194] The PCR reaction was carried out by suspending the cDNA and the primers in 100 microliters of reaction
mixture containing the four deoxynucleoside triphosphates, buffer salts and metal ions, and a thermostable DNA
polymerase isolated from Thermus aquaticus (Taq polymerase), which are in a Perkin Elmer Cetus PCR kit (N801-0043
or N801-0055). The PCR reaction was performed for 35 cycles in a Perkin Elmer Cetus DNA thermal cycler. Each
cycle consisted of a 1.5 min denaturation step at 94°C, an annealing step at 60°C for 2 min, and a primer extension
step at 72°C for 3 min. The PCR products were subjected to Southern blot analysis using a 30 nucleotide probe, JH34,
the sequence of which was based upon that of the 3'-terminal region of clone 15e. The sequence of JH34 is:


5' CTT GAT CTA CCT CCA ATC ATT CAA AGA CTC 3'.


The PCR products detected by the HCV cDNA probe ranged in size from about 50 to about 400 base pairs.
[0195] In order to clone the amplified HCV cDNA, the PCR products were cleaved with NotI and size selected by
polyacrylamide gel electrophoresis. DNA larger than 300 base pairs was cloned into the NotI site of pUC18S. The vector
pUC18S is constructed by including a NotI polylinker cloned between the EcoRI and SalI sites of pUC18. The clones
were screened for HCV cDNA using the JH34 probe. A number of positive clones were obtained and sequenced. The
nucleotide sequence of the HCV cDNA insert in one of these clones, 16jh, and the amino acids encoded therein, are
shown in Fig. 15. A nucleotide heterogeneity, detected in the sequence of the HCV cDNA in clone 16jh as compared
to another clone of this region, is indicated in the figure.

Compiled HCV cDNA Sequences 

[0196] An HCV cDNA sequence has been compiled from a series of overlapping clones derived from the various
HCV cDNA libraries described supra. In this sequence, the compiled HCV cDNA sequence obtained from clones
b114a, 18g, ag30a, CA205a, CA290a, CA216a, pi14a, CA167b, CA156e, CA84a, and CA59a is upstream of the
compiled HCV cDNA sequence published in EPO Pub. No. 318,216, which is shown in Fig. 16. The compiled HCV cDNA
sequence obtained from clones b5a and 16jh is downstream of the compiled HCV cDNA sequence published in EPO
Pub. No. 318,216.

[0197] Fig. 17 shows the compiled HCV cDNA sequence derived from the above-described clones and the compiled
HCV cDNA sequence published in EPO Pub. No. 318,216. The clones from which the sequence was derived are
b114a, 18g, ag30a, CA205a, CA290a, CA216a, pi14a, CA167b, CA156e, CA84a, CA59a, K9-1 (also called k9-1), 26j,
13i, 12f, 14i, 11b, 7f, 7e, 8h, 33c, 40b, 37b, 35, 36, 81, 32, 33b, 25c, 14c, 8f, 33f, 33g, 39c, 35f, 19g, 26g, 15e, b5a, and
16jh. In the figure the three dashes above the sequence indicate the position of the putative initiator methionine codon.
[0198] Clone b114a was obtained using the cloning procedure described for clone b5a, supra, except that the probe
was the synthetic probe used to detect clone 18g, supra. Clone b114a overlaps with clones 18g, ag30a, and CA205a,
except that clone b114a contains an extra two nucleotides upstream of the sequence in clone 18g (i.e., 5'-CA). These
extra two nucleotides have been included in the HCV genomic sequence shown in Fig. 17.

[0199] It should be noted that although several of the clones described supra have been obtained from libraries
other than the original HCV cDNA lambda-gt11 C library (ATCC No. 40394), these clones contain HCV cDNA sequences
which overlap HCV cDNA sequences in the original library. Thus, essentially all of the HCV sequence is derivable from
the original lambda-gt11 C library (ATCC No. 40394) which was used to isolate the first HCV cDNA clone (5-1-1). The
isolation of clone 5-1-1 is described in EPO Pub. No. 318,216.

Purification of Fusion Polypeptide C100-3 (Alternate method)

[0200] The fusion polypeptide, C100-3 (also called HCV c100-3 and alternatively, c100-3), is comprised of superoxide
dismutase (SOD) at the N-terminus and an in-frame C100 HCV polypeptide at the C-terminus. A method for preparing the
polypeptide by expression in yeast, and differential extraction of the insoluble fraction of the extracted host yeast cells,
is described in EPO Pub. No. 318,216. An alternative method for the preparation of this fusion polypeptide is described
below. In this method the antigen is precipitated from the crude cell lysate with acetone; the acetone precipitated antigen
is then subjected to ion-exchange chromatography, and further purified by gel filtration.
[0201] The fusion polypeptide, C100-3 (HCV c100-3), is expressed in yeast strain JSC 308 (ATCC No. 20879)
transformed with pAB24C100-3 (ATCC No. 67976); the transformed yeast are grown under conditions which allow
expression (i.e., by growth in YEP containing 1% glucose). (See EPO Pub. No. 318,216). A cell lysate is prepared by
suspending the cells in Buffer A (20 mM Tris HCl, pH 8.0, 1 mM EDTA, 1 mM PMSF). The cells are broken by grinding with
glass beads in a Dynomill type homogenizer or its equivalent. The extent of cell breakage is monitored by counting
cells under a microscope with phase optics. Broken cells appear dark, while viable cells are light-colored. The
percentage of broken cells is determined.

[0202] When the percentage of broken cells is approximately 90% or greater, the broken cell debris is separated
from the glass beads by centrifugation, and the glass beads are washed with Buffer A. After combining the washes
and homogenate, the insoluble material in the lysate is obtained by centrifugation. The material in the pellet is washed
to remove soluble proteins by suspension in Buffer B (50 mM glycine, pH 12.0, 1 mM DTT, 500 mM NaCl), followed
by Buffer C (50 mM glycine, pH 10.0, 1 mM DTT). The insoluble material is recovered by centrifugation, and solubilized
by suspension in Buffer C containing SDS. The extract solution may be heated in the presence of beta-mercaptoethanol
and concentrated by ultrafiltration. The HCV c100-3 in the extract is precipitated with cold acetone. If desired, the
precipitate may be stored at temperatures at about or below -15°C.

[0203] Prior to ion exchange chromatography, the acetone precipitated material is recovered by centrifugation, and
may be dried under nitrogen. The precipitate is suspended in Buffer D (50 mM glycine, pH 10.0, 1 mM DTT, 7 M urea),
and centrifuged to pellet insoluble material. The supernatant material is applied to an anion exchange column previously
equilibrated with Buffer D. Fractions are collected and analyzed by ultraviolet absorbance or gel electrophoresis on
SDS polyacrylamide gels. Those fractions containing the HCV c100-3 polypeptide are pooled.

[0204] In order to purify the HCV c100-3 polypeptide by gel filtration, the pooled fractions from the ion-exchange
column are heated in the presence of beta-mercaptoethanol and SDS, and the eluate is concentrated by ultrafiltration.
The concentrate is applied to a gel filtration column previously equilibrated with Buffer E (20 mM Tris HCl, pH 7.0, 1
mM DTT, 0.1% SDS). The presence of HCV c100-3 in the eluted fractions, as well as the presence of impurities, is
determined by gel electrophoresis on polyacrylamide gels in the presence of SDS and visualization of the polypeptides.
Those fractions containing purified HCV c100-3 are pooled. Fractions high in HCV c100-3 may be further purified by
repeating the gel filtration process. If the removal of particulate material is desired, the HCV c100-3 containing material
may be filtered through a 0.22 micron filter.

Expression and Antigenicity of Polypeptides Encoded in HCV cDNA 


Polypeptides Expressed in E. coli 

[0205] The polypeptides encoded in a number of HCV cDNAs which span the HCV genomic ORF were expressed
in E. coli, and tested for their antigenicity using serum obtained from a variety of individuals with NANBH. The expression
vectors containing the cloned HCV cDNAs were constructed from pSODcf1 (Steimer et al. (1986)). In order to be certain
that a correct reading frame would be achieved, three separate expression vectors, pcf1AB, pcf1CD, and pcf1EF, were
created by ligating one of three linkers, AB, CD, and EF, to a BamHI-EcoRI fragment derived by digesting to completion
the vector pSODcf1 with EcoRI and BamHI, followed by treatment with alkaline phosphatase. The linkers were created
from six oligomers, A, B, C, D, E, and F. Each oligomer was phosphorylated by treatment with kinase in the presence
of ATP prior to annealing to its complementary oligomer. The sequences of the synthetic linkers were the following.


Name    DNA Sequence (5' to 3')

A       GATC CTG AAT TCC TGA TAA
B            GAC TTA AGG ACT ATT TTA A
C       GATC CGA ATT CTG TGA TAA
D            GCT TAA GAC ACT ATT TTA A
E       GATC CTG GAA TTC TGA TAA
F            GAC CTT AAG ACT ATT TTA A


Each of the three linkers destroys the original EcoRI site, and creates a new EcoRI site within the linker, but within a
different reading frame. Hence, the HCV cDNA EcoRI fragments isolated from the clones, when inserted into the
expression vector, were in three different reading frames.
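
The frame shift introduced by the three linkers can be illustrated by locating the EcoRI site (GAATTC) in each top-strand oligomer and taking its offset modulo 3. This is a minimal sketch in Python; the function name `ecori_frame` is hypothetical, and the frame is counted from the first nucleotide of each oligomer, which is an assumption made for illustration only (the actual frame relative to the SOD coding sequence depends on the vector context).

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

# Top-strand oligomers A, C, and E of linkers AB, CD, and EF,
# transcribed from the table of linker sequences (spaces removed).
LINKER_TOP_STRANDS = {
    "AB": "GATCCTGAATTCCTGATAA",
    "CD": "GATCCGAATTCTGTGATAA",
    "EF": "GATCCTGGAATTCTGATAA",
}


def ecori_frame(top_strand: str) -> tuple[int, int]:
    """Return (offset, offset mod 3) of the EcoRI site in the oligomer.

    The offset is the 0-based position of the first base of GAATTC,
    counted from the 5' end of the oligomer.
    """
    offset = top_strand.index(ECORI_SITE)
    return offset, offset % 3


if __name__ == "__main__":
    for name, seq in LINKER_TOP_STRANDS.items():
        offset, frame = ecori_frame(seq)
        print(f"linker {name}: EcoRI site at offset {offset}, frame {frame}")
```

The three offsets differ by one nucleotide each, so a fragment inserted at the EcoRI site is read in a different frame in each vector.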

[0206] The HCV cDNA fragments in the designated lambda-gt11 clones were excised by digestion with EcoRI; each
fragment was inserted into pcf1AB, pcf1CD, and pcf1EF. These expression constructs were then transformed into D1210
E. coli cells, the transformants were cloned, and recombinant bacteria from each clone were induced to express the
fusion polypeptides by growing the bacteria in the presence of IPTG.

[0207] Expression products of the indicated HCV cDNAs were tested for antigenicity by direct immunological
screening of the colonies, using a modification of the method described in Helfman et al. (1983). Briefly, as shown in Fig. 18,
the bacteria were plated onto nitrocellulose filters overlaid on ampicillin plates to give approximately 1,000 colonies
per filter. Colonies were replica plated onto nitrocellulose filters, and the replicas were regrown overnight in the presence
of 2 mM IPTG and ampicillin. The bacterial colonies were lysed by suspending the nitrocellulose filters for about 15 to
20 min in an atmosphere saturated with CHCl3 vapor. Each filter then was placed in an individual 100 mm Petri dish
containing 10 ml of 50 mM Tris HCl, pH 7.5, 150 mM NaCl, 5 mM MgCl2, 3% (w/v) BSA, 40 micrograms/ml lysozyme,
and 0.1 microgram/ml DNase. The plates were agitated gently for at least 8 hours at room temperature. The filters
were rinsed in TBST (50 mM Tris HCl, pH 8.0, 150 mM NaCl, 0.005% Tween 20). After incubation, the cell residues
were rinsed and incubated in TBS (TBST without Tween) containing 10% sheep serum; incubation was for 1 hour. The
filters were then incubated with pretreated sera in TBS from individuals with NANBH, which included: 3 chimpanzees;
8 patients with chronic NANBH whose sera were positive with respect to antibodies to HCV C100-3 polypeptide
(described in EPO Pub. No. 318,216, and supra) (also called C100); 8 patients with chronic NANBH whose sera were
negative for anti-C100 antibodies; a convalescent patient whose serum was negative for anti-C100 antibodies; and 6
patients with community acquired NANBH, including one whose serum was strongly positive with respect to anti-C100
antibodies, and one whose serum was marginally positive with respect to anti-C100 antibodies. The sera, diluted in TBS,
were pretreated by preabsorption with hSOD. Incubation of the filters with the sera was for at least two hours. After
incubation, the filters were washed two times for 30 min with TBST. Labeling of expressed proteins to which antibodies
in the sera bound was accomplished by incubation for 2 hours with 125I-labeled sheep anti-human antibody. The
filters were then washed twice for 30 min with TBST, dried, and autoradiographed.

[0208] A number of clones (see infra) expressed polypeptides containing HCV epitopes which were immunologically
reactive with serum from individuals with NANBH. Five of these polypeptides were very immunogenic in that antibodies
to HCV epitopes in these polypeptides were detected in many different patient sera. The clones encoding these
polypeptides, and the location of the polypeptide in the putative HCV polyprotein (wherein the amino acid numbers begin with
the putative initiator codon) are the following: clone 5-1-1, amino acids 1694-1735; clone C100, amino acids 1569-1931;
clone 33c, amino acids 1192-1457; clone CA279a, amino acids 1-84; and clone CA290a, amino acids 9-177. The
locations of the immunogenic polypeptides within the putative HCV polyprotein are shown immediately below.
Clones encoding polypeptides of proven reactivity with sera from NANBH patients.

[0209]

Clone     Location within the HCV polyprotein (amino acid no. beginning with putative initiator methionine)

CA279a    1-84
CA74a     [illegible]
13i       [illegible]
CA290a    9-177
33c       1192-1457
40b       [illegible]
5-1-1     1694-1735
81        1689-1805
33b       1916-2021
25c       1949-2124
14c       2054-2223
8f        2200-3325
33f       2287-2385
33g       2348-2464
39c       2371-2502
15e       2796-2886
C100      1569-1931


[0210] The results on the immunogenicity of the polypeptides encoded in the various clones examined suggest that
efficient detection and immunization systems may include panels of HCV polypeptides/epitopes.


Expression of HCV Epitopes in Yeast 

[0211] Three different yeast expression vectors which allow the insertion of HCV cDNA into three different reading
frames are constructed. The construction of one of the vectors, pAB24C100-3, is described in EPO Pub. No. 318,216.
In the studies below, the HCV cDNA from the clones listed supra, in the antigenicity mapping study using the E. coli
expressed products, is substituted for the C100 HCV cDNA. The construction of the other vectors replaces the adaptor
described in the above E. coli studies with one of the following adaptors:

Adaptor 1

ATT TTG AAT TCC TAA TGA G
     AC TTA AGG ATT ACT CAG CT

Adaptor 2

AAT TTG GAA TTC TAA TGA G
     AC CTT AAG ATT ACT CAG CT.

The inserted HCV cDNA is expressed in yeast transformed with the vectors, using the expression conditions described
supra for the expression of the fusion polypeptide, C100-3. The resulting polypeptides are screened using the sera
from individuals with NANBH, described supra for the screening of immunogenic polypeptides encoded in HCV cDNAs
expressed in E. coli.
Comparison of the Hydrophobic Profiles of HCV Polyproteins with West Nile Virus Polyprotein and with Dengue Virus
NS1


[0212] The hydrophobicity profile of an HCV polyprotein segment was compared with that of a typical Flavivirus,
West Nile virus. The polypeptide sequence of the West Nile virus polyprotein was deduced from the known
polynucleotide sequences encoding the non-structural proteins of that virus. The HCV polyprotein sequence was deduced from
the sequence of overlapping cDNA clones. The profiles were determined using an antigen program which uses a
window of 7 amino acid width (the amino acid in question, and 3 residues on each side) to report the average
hydrophobicity about a given amino acid residue. The parameters giving the relative hydrophobicity for each amino acid
residue are from Kyte and Doolittle (1982). Fig. 19 shows the hydrophobic profiles of the two polyproteins; the areas
corresponding to the non-structural proteins of West Nile virus, ns1 through ns5, are indicated in the figure. As seen in
the figure, there is a general similarity in the profiles of the HCV polyprotein and the West Nile virus polyprotein.
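
The windowed averaging described above can be sketched as follows. This is a minimal illustration in Python, not the program actually used; the function name `hydropathy_profile` and the example peptide are hypothetical, and only the 7-residue window and the Kyte and Doolittle scale are taken from the text.

```python
# Kyte and Doolittle (1982) hydropathy values, one-letter amino acid codes.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list[float]:
    """Average hydropathy over a centered window for each interior residue.

    With window=7 each value covers the residue in question and 3
    residues on each side. Residues closer than window // 2 to either
    end have no complete window and are omitted from the result.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    # An unrecognized residue letter raises KeyError here.
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[i - half:i + half + 1]) / window
        for i in range(half, len(scores) - half)
    ]


if __name__ == "__main__":
    peptide = "MKTLLVAGIVLAARDEKS"  # hypothetical sequence, illustration only
    for position, value in enumerate(hydropathy_profile(peptide), start=4):
        print(position, round(value, 2))
```

Positive averages indicate hydrophobic stretches and negative averages hydrophilic ones on this scale; the sign convention used for plotting in the figures may differ.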
[0213] The sequence of the amino acids encoded in the 5'-region of HCV cDNA shown in Fig. 16 has been compared
with the corresponding region of one of the strains of Dengue virus, described supra, with respect to the profile of
regions of hydrophobicity and hydrophilicity (data not shown). This comparison indicated that the polypeptides from
HCV and Dengue encoded in this region, which corresponds to the region encoding NS1 (or a portion thereof), have
a similar hydrophobic/hydrophilic profile.

[0214] The similarity in hydrophobicity profiles, in combination with the previously identified homologies in the amino
acid sequences of HCV and Dengue Flavivirus in EP 0,218,316, suggests that HCV is related to these members of the
Flavivirus family.

Characterization of the Putative Polypeptides Encoded Within the HCV ORF 

[0215] The sequence of the HCV cDNA sense strand, shown in Fig. 17, was deduced from the overlapping HCV
cDNAs in the various clones described in EPO Pub. No. 318,216 and those described supra. It may be deduced from
the sequence that the HCV genome contains primarily one long continuous ORF, which encodes a polyprotein. In the
sequence, nucleotide number 1 corresponds to the first nucleotide of the initiator MET codon; minus numbers indicate
that the nucleotides are that distance away in the 5'-direction (upstream), while positive numbers indicate that the
nucleotides are that distance away in the 3'-direction (downstream). The composite sequence shows the "sense" strand
of the HCV cDNA.

[0216] The amino acid sequence of the putative HCV polyprotein deduced from the HCV cDNA sense strand
sequence is also shown in Fig. 17, where position 1 begins with the putative initiator methionine.
[0217] Possible protein domains of the encoded HCV polyprotein, as well as the approximate boundaries, are the 
following (the polypeptides identified within the parentheses are those which are encoded in the Flavivirus domain): 


Putative Domain                                                       Approximate Boundary (amino acid nos.)

"C" (nucleocapsid protein)                                            1-120
"E" (virion envelope protein(s) and possibly matrix (M) proteins)     120-400
"NS1" (complement fixation antigen?)                                  400-660
"NS2" (unknown function)                                              660-1050
"NS3" (protease?)                                                     1050-1640
"NS4" (unknown function)                                              1640-2000
"NS5" (polymerase)                                                    2000-? end


It should be noted, however, that hydrophobicity profiles (described infra), indicate that HCV diverges from the Flavivirus 
model, particularly with respect to the region upstream of NS2. Moreover, the boundaries indicated are not intended 
to show firm demarcations between the putative polypeptides. 
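
As an illustration of how the approximate boundaries above relate to the immunogenic clones listed in paragraph [0208], the following minimal sketch in Python reports which putative domains each clone's amino acid range overlaps. The function name `overlapping_domains` is hypothetical, the boundaries are the approximate values from the table, and a shared boundary residue is treated as belonging to both neighboring domains, which is an assumption made for simplicity.

```python
# Approximate domain boundaries from the table above (amino acid nos.).
# None marks the undetermined end of the last domain.
DOMAINS = [
    ("C", 1, 120),
    ("E", 120, 400),
    ("NS1", 400, 660),
    ("NS2", 660, 1050),
    ("NS3", 1050, 1640),
    ("NS4", 1640, 2000),
    ("NS5", 2000, None),
]

# Locations of the five highly immunogenic polypeptides from paragraph [0208].
CLONES = {
    "5-1-1": (1694, 1735),
    "C100": (1569, 1931),
    "33c": (1192, 1457),
    "CA279a": (1, 84),
    "CA290a": (9, 177),
}


def overlapping_domains(start: int, end: int) -> list[str]:
    """Return names of the putative domains that overlap [start, end]."""
    hits = []
    for name, low, high in DOMAINS:
        upper = float("inf") if high is None else high
        if start <= upper and end >= low:
            hits.append(name)
    return hits


if __name__ == "__main__":
    for clone, (start, end) in CLONES.items():
        print(clone, start, end, overlapping_domains(start, end))
```

Because the boundaries are approximate and are not firm demarcations, the output is only a rough guide to where each polypeptide falls.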


The Hydrophilic and Antigenic Profile of the Polypeptide 

[0218] Profiles of the hydrophilicity/hydrophobicity and the antigenic index of the putative polyprotein encoded in the
HCV cDNA sequence shown in Fig. 16 were determined by computer analysis. The program for
hydrophilicity/hydrophobicity was as described supra. The antigenic index results from a computer program which relies on the following
criteria: 1) surface probability; 2) prediction of alpha-helicity by two different methods; 3) prediction of beta-sheet regions
by two different methods; 4) prediction of U-turns by two different methods; 5) hydrophilicity/hydrophobicity; and
6) flexibility. The traces of the profiles generated by the computer analyses are shown in Fig. 20. In the hydrophilicity profile,
deflection above the abscissa indicates hydrophilicity, and below the abscissa indicates hydrophobicity. The probability
that a polypeptide region is antigenic is usually considered to increase when there is a deflection upward from the
abscissa in the hydrophilic and/or antigenic profile. It should be noted, however, that these profiles are not necessarily
indicators of the strength of the immunogenicity of a polypeptide.

Identification of Co-linear Peptides in HCV and Flaviviruses 

[0219] The amino acid sequence of the putative polyprotein encoded in the HCV cDNA sense strand was compared
with the known amino acid sequences of several members of the Flaviviruses. The comparison shows that homology is
slight, but due to the regions in which it is found, it is probably significant. The conserved co-linear regions are shown
in Fig. 21. The amino acid numbers listed below the sequences represent the number in the putative HCV polyprotein
(see Fig. 17).

[0220] The spacing of these conserved motifs is similar between the Flaviviruses and HCV, and implies that there 
is some similarity between HCV and these flaviviral agents. 

[0221] The following listed materials are on deposit under the terms of the Budapest Treaty with the American Type 
Culture Collection (ATCC), 12301 Parklawn Dr., Rockville, Maryland 20852, and have been assigned the following 
Accession Numbers. 


lambda-gt11         ATCC No.    Deposit Date

HCV cDNA library    40394       1 Dec. 1987
clone 81            40388       17 Nov. 1987
clone 91            40389       17 Nov. 1987
clone 1-2           40390       17 Nov. 1987
clone 5-1-1         40391       18 Nov. 1987
clone 12f           40514       10 Nov. 1988
clone 35f           40511       10 Nov. 1988
clone 15e           40513       10 Nov. 1988
clone K9-1          40512       10 Nov. 1988
JSC 308             20879       5 May 1988
pS356               67683       29 April 1988


In addition, the following deposits were made on 11 May 1989. 


Strain                   Linkers    ATCC No.

D1210 (Cf1/5-1-1)        EF         67957
D1210 (Cf1/81)           EF         67968
D1210 (Cf1/CA74a)        EF         67969
D1210 (Cf1/35f)          AB         67970
D1210 (Cf1/279a)         EF         67971
D1210 (Cf1/C36)          CD         67972
D1210 (Cf1/13i)          AB         67973
D1210 (Cf1/C33b)         EF         67974
D1210 (Cf1/CA290a)       AB         67975
HB101 (AB24/C100#3R)                67976


The following derivatives of strain D1210 were deposited on 3 May 1989. 


Strain Derivative    ATCC No.

pCF1CS/C8f           67956
pCF1AB/C12f          67952
pCF1EF/14c           67949
pCF1EF/15e           67954
pCF1AB/C25c          67958
pCF1EF/C33c          67953
pCF1EF/C33f          67050
pCF1CD/33g           67951
pCF1CD/C39c          67955
pCF1EF/C40b          67957
pCF1EF/CA167b        67959


The following strains were deposited on May 12, 1989. 


Strain                  ATCC No.

Lambda gt11(C35)        40603
Lambda gt10(beta-5a)    40602
D1210(C40b)             67980
D1210(M16)              67981

[0222] The deposited materials mentioned herein are intended for convenience only, and are not required to practice 
the present invention in view of the descriptions herein, and in addition these materials are incorporated herein by 
reference. 

Industrial Applicability

[0223] The invention, in the various manifestations disclosed herein, has many industrial uses, some of which are
the following. The HCV cDNAs may be used for the design of probes for the detection of HCV nucleic acids in samples.
The probes derived from the cDNAs may be used to detect HCV nucleic acids in, for example, chemical synthetic
reactions. They may also be used in screening programs for anti-viral agents, to determine the effect of the agents in
inhibiting viral replication in cell culture systems, and animal model systems. The HCV polynucleotide probes are also
useful in detecting viral nucleic acids in humans, and thus, may serve as a basis for diagnosis of HCV infections in
humans.

[0224] In addition to the above, the cDNAs provided herein provide information and a means for synthesizing
polypeptides containing epitopes of HCV. These polypeptides are useful in detecting antibodies to HCV antigens. A series of
immunoassays for HCV infection, based on recombinant polypeptides containing HCV epitopes, are described herein,
and will find commercial use in diagnosing HCV induced NANBH, in screening blood bank donors for HCV-caused
infectious hepatitis, and also for detecting contaminated blood from infectious blood donors. The viral antigens will also
have utility in monitoring the efficacy of anti-viral agents in animal model systems. In addition, the polypeptides derived
from the HCV cDNAs disclosed herein will have utility as vaccines for treatment of HCV infections.

[0225] The polypeptides derived from the HCV cDNAs, besides the above stated uses, are also useful for raising
anti-HCV antibodies. Thus, they may be used in anti-HCV vaccines. However, the antibodies produced as a result of
immunization with the HCV polypeptides are also useful in detecting the presence of viral antigens in samples. Thus,
they may be used to assay the production of HCV polypeptides in chemical systems. The anti-HCV antibodies may
also be used to monitor the efficacy of anti-viral agents in screening programs where these agents are tested in tissue
culture systems. They may also be used for passive immunotherapy, and to diagnose HCV caused NANBH by allowing
the detection of viral antigen(s) in both blood donors and recipients. Another important use for anti-HCV antibodies is
in affinity chromatography for the purification of virus and viral polypeptides. The purified virus and viral polypeptide
preparations may be used in vaccines. However, the purified virus may also be useful for the development of cell culture
systems in which HCV replicates.

[0226] Antisense polynucleotides may be used as inhibitors of viral replication.
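
The notion of an antisense polynucleotide having a contiguous stretch complementary to the sense strand can be illustrated computationally. The following is a minimal sketch in Python; the function names and the example sense sequence are hypothetical and are not taken from any HCV sequence. It derives the antisense (reverse complement) of an RNA sequence and measures the longest contiguous run of an oligonucleotide that is perfectly complementary to the sense strand.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(seq: str) -> str:
    """Upper-case the sequence and write T as U."""
    return seq.upper().replace("T", "U")


def antisense(sense: str) -> str:
    """Return the reverse complement of `sense`, written 5' to 3'."""
    return to_rna(sense).translate(_COMPLEMENT)[::-1]


def longest_complementary_run(sense: str, oligo: str) -> int:
    """Length of the longest contiguous stretch of `oligo` that is
    perfectly complementary (antiparallel) to some region of `sense`.

    Computed as the longest common substring between the oligo and
    the full antisense strand, by dynamic programming.
    """
    target = antisense(sense)
    query = to_rna(oligo)
    best = 0
    previous = [0] * (len(target) + 1)
    for base in query:
        current = [0] * (len(target) + 1)
        for j, target_base in enumerate(target, start=1):
            if base == target_base:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    sense = "AUGGCUAGCUUAGGCAUCGAUCCGUA"  # hypothetical sense-strand fragment
    oligo = antisense(sense)[5:17]        # a 12-nucleotide antisense oligomer
    run = longest_complementary_run(sense, oligo)
    print(oligo, run, run >= 8)
```

Selective hybridization in practice depends on conditions and on more than perfect-match length, so this check is only a sequence-level illustration.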

[0227] For convenience, the anti-HCV antibodies and HCV polypeptides, whether natural or recombinant, may be 
packaged into kits. 


Claims 

1. A pharmaceutical composition comprising a hepatitis C virus (HCV) antisense polynucleotide which is selectively
hybridizable for the sense strand of the genome of an HCV, wherein the polynucleotide comprises a contiguous
sequence of at least 8 nucleotides complementary to the sense strand of the genome of an HCV and a
pharmaceutically acceptable excipient, wherein HCV is characterized by:

(i) a positive stranded RNA genome; 

(ii) said genome comprising an open reading frame (ORF) encoding a polyprotein; and 

(iii) said polyprotein is encoded by an HCV genome wherein the HCV genome has an homology at the
polypeptide level of at least 40% to the 169 amino acid sequence shown in Figure 11.

2. The pharmaceutical composition according to claim 1, wherein the contiguous sequence is at least 10 nucleotides.

3. The pharmaceutical composition according to claim 1 or claim 2 wherein the contiguous sequence is at least 12
nucleotides.

4. The pharmaceutical composition according to any of claims 1 to 3 wherein the contiguous sequence is at least 15 
nucleotides. 

5. The pharmaceutical composition according to any of claims 1 to 4 wherein the contiguous sequence is at least 20
nucleotides.

6. The pharmaceutical composition according to any one of claims 1 to 5 wherein the polynucleotide is selectively
hybridizable to the genome of HCV-1.

7. The pharmaceutical composition according to any one of claims 1 to 6 further comprising a molecule capable of
causing a scission in the HCV RNA genome, wherein the molecule is covalently bound or non-covalently attached
to the polynucleotide.

8. The pharmaceutical composition according to any one of claims 1 to 7 comprising at least one substituted or altered 
bond between bases. 

9. The pharmaceutical composition according to claim 8 wherein said at least one substituted or altered bond is a 
phosphothionate bond. 

10. A composition comprising a hepatitis C virus (HCV) antisense polynucleotide which is selectively hybridizable to 
the sense strand of the genome of an HCV, wherein the polynucleotide comprises a contiguous sequence of at 
least 8 nucleotides complementary to the sense strand of the genome of an HCV, and an agent which causes viral 
RNA to be inactive, wherein HCV is characterized by: 

(i) a positive stranded RNA genome; 

(ii) said genome comprising an open reading frame (ORF) encoding a polyprotein; and 

(iii) said polyprotein is encoded by an HCV genome wherein the HCV genome has an homology at the
polypeptide level of at least 40% to the 169 amino acid sequence shown in Figure 11.

11. The composition according to claim 10 wherein the contiguous sequence is at least 10 nucleotides. 

12. The composition according to claim 10 or claim 11 wherein the contiguous sequence is at least 12 nucleotides. 

13. The composition according to any of claims 10 to 12 wherein the contiguous sequence is at least 15 nucleotides. 

14. The composition according to any of claims 10 to 13 wherein the agent is non-covalently bound to the antisense 
polynucleotide. 

15. The composition according to any of claims 10 to 13 wherein the agent is covalently bound to the antisense poly- 
nucleotide. 

16. The composition according to any of claims 10 to 15 wherein the polynucleotide is selectively hybridizable to the
genome of HCV-1.




EP 1 034 785 A2 




17. A pharmaceutical composition comprising a hepatitis C virus (HCV) antisense polynucleotide selectively hybrid- 
izable to the sense strand of the genome of an HCV, wherein the polynucleotide comprises a sequence of at least 
8 nucleotides corresponding to a region of the sense strand of the genome of an HCV, and a pharmaceutically
acceptable excipient, wherein HCV is characterized by: 

(i) a positive stranded RNA genome; 

(ii) said genome comprising an open reading frame (ORF) encoding a polyprotein; and 

(iii) said polyprotein is encoded by an HCV genome wherein the HCV genome has an homology at the polypeptide
level of at least 40% to the 169 amino acid sequence shown in Figure 11.

18. The composition according to claim 17 wherein the polynucleotide sequence is at least 10 nucleotides. 

19. The composition according to claim 17 or claim 18 wherein the polynucleotide sequence is at least 12 nucleotides.

20. The composition according to any of claims 17 to 19 wherein the polynucleotide is selectively hybridizable to the
genome of HCV-1.

21. A method of preventing HCV viral replication in a system comprising adding to the system the HCV antisense 
polynucleotide according to any one of claims 1 to 20. 
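
The length thresholds recited in the claims above (a contiguous sequence of at least 8, 10, 12, 15 or 20 nucleotides complementary to the sense strand) can be illustrated with a minimal sketch. It assumes perfect Watson-Crick pairing only and does not model selective hybridization, which depends on conditions not captured here. All function names are hypothetical and chosen for illustration; the example sense fragment is taken from the first sequence line of FIG. 1.

```python
# Illustrative helper names; none of them come from the specification.
_PAIR = {"A": "U", "C": "G", "G": "C", "U": "A"}


def to_rna(seq: str) -> str:
    """Upper-case a sequence and write it in the RNA alphabet (T -> U)."""
    return seq.upper().replace("T", "U")


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs antiparallel with `seq`, read 5' to 3'."""
    return "".join(_PAIR[base] for base in reversed(to_rna(seq)))


def longest_complementary_run(antisense: str, sense: str) -> int:
    """Length of the longest contiguous stretch of `antisense` that is
    perfectly complementary to a contiguous stretch of `sense`."""
    target = reverse_complement(antisense)  # reads like the sense strand
    sense = to_rna(sense)
    best = 0
    prev = [0] * (len(sense) + 1)
    # Longest-common-substring dynamic programme over the two strings.
    for a in target:
        cur = [0] * (len(sense) + 1)
        for j, s in enumerate(sense, start=1):
            if a == s:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def meets_length_threshold(antisense: str, sense: str, min_len: int = 8) -> bool:
    """True if at least `min_len` contiguous nucleotides are complementary."""
    return longest_complementary_run(antisense, sense) >= min_len


# Fragment of the sense strand from FIG. 1 and a 12-mer antisense oligo
# built against part of it (the oligo itself is an assumed example).
SENSE_FRAGMENT = "CCATATTTAAAATCAGGATGTACG"
OLIGO = reverse_complement("AAATCAGGATGT")
```

With these definitions, `meets_length_threshold(OLIGO, SENSE_FRAGMENT, 12)` holds, while a threshold of 15 is not met by the same oligo.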

FIG. 1 Translation of DNA 12f

IlePheLysIleArgMetTyrValGlyGlyValGluHisArgLeuGluAlaAlaCysAsn 
1 CCATATTTAAAATCAGGATGTACGTGGGAGGGGTCGAACACAGGCTGGAAGCTGCCTGCA 
GGTATAAATTTTAGTCCTACATGCACCCTCCCCAGCTTGTGTCCGACCTTCGACGGACGT 

TrpThrArgGlyGluArgCysAspLeuGluAspArgAspArgSerGluLeuSerProLeu
61 ACTGGACGCGGGGCGAACGTTGCGATCTGGAAGACAGGGACAGGTCCGAGCTCAGCCCGT 
TGACCTGCGCCCCGCTTGCAACGCTAGACCTTCTGTCCCTGTCCAGGCTCGAGTCGGGCA 

LeuLeuThrThrThrGlnTrpGlnValLeuProCysSerPheThrThrLeuProAlaLeu
121 TACTGCTGACCACTACACAGTGGCAGGTCCTCCCGTGTTCCTTCACAACCCTACCAGCCT
ATGACGACTGGTCATGTGTCACCGTCCAGGAGGGCACAAGGAAGTGTTGGGATGGTCGGA 

SerThrGlyLeuIleHisLeuHisGlnAsnIleValAspValGlnTyrLeuTyrGlyVal
181 TGTCCACCGGCCTCATCCACCTCCACCAGAACATTGTGGACGTGCAGTACTTGTACGGGG
ACAGGTGGCCGGAGTAGGTGGAGGTGGTCTTGTAACACCTGCACGTCATGAACATGCCCC 


GlySerSerIleAlaSerTrpAlaIleLysTrpGluTyrValValLeuLeuPheLeuLeu
241 TGGGGTCAAGCATCGCGTCCTGGGCCATTAAGTGGGAGTACGTCGTTCTCCTGTTCCTTC
ACCCCAGTTCGTAGCGCAGGACCCGGTAATTCACCCTCATGCAGCAAGAGGACAAGGAAG 


LeuAlaAspAlaArgValCysSerCysLeuTrpMetMetLeuLeuIleSerGlnAlaGlu 
301 TGCTTGCAGACGCGCGCGTCTGCTCCTGCTTGTGGATGATGCTACTCATATCCCAAGCGG
ACGAACGTCTGCGCGCGCAGACGAGGACGAACACCTACTACGATGAGTATAGGGTTCGCC 


AlaAlaLeuGluAsnLeuValIleLeuAsnAlaAlaSerLeuAlaGlyThrHisGlyLeu
361 AGGCGGCTTTGGAGAACCTCGTAATACTTAATGCAGCATCCCTGGCCGGGACGCACGGTC
TCCGCCGAAACCTCTTGGAGCATTATGAATTACGTCGTAGGGACCGGCCCTGCGTGCCAG


Val 

421 TTGTATC 
AACATAG 
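
The layout used in this and the following figures (amino acid line, numbered sense strand, complementary strand beneath) can be checked mechanically. The sketch below assumes the standard genetic code and uses the first 60 nucleotides of FIG. 1; the reading frame offset of 2 is inferred from the figure, where the first codon (ATA, Ile) begins at nucleotide 3. The function names are hypothetical, and the stop-codon labels OP, AM and OC follow the opal/amber/ochre abbreviations seen in the later figures.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(_BASES, repeat=3), _AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
STOP_NAMES = {"TGA": "OP", "TAG": "AM", "TAA": "OC"}  # opal, amber, ochre


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same left-to-right order
    as the lower strand in the figures (not reversed)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))


def translate(seq: str, offset: int = 0) -> str:
    """Translate complete codons of `seq` starting at `offset` into
    concatenated three-letter amino acid codes."""
    seq = seq.upper()
    names = []
    for i in range(offset, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE[codon]
        names.append(STOP_NAMES[codon] if aa == "*" else THREE_LETTER[aa])
    return "".join(names)


# First sequence line of FIG. 1 (sense strand) and the strand printed below it.
FIG1_SENSE = "CCATATTTAAAATCAGGATGTACGTGGGAGGGGTCGAACACAGGCTGGAAGCTGCCTGCA"
FIG1_LOWER = "GGTATAAATTTTAGTCCTACATGCACCCTCCCCAGCTTGTGTCCGACCTTCGACGGACGT"
```

Under these assumptions `complement(FIG1_SENSE)` reproduces the lower strand, and `translate(FIG1_SENSE, offset=2)` yields the first nineteen residues printed above the sequence; the twentieth (Asn) spans the line break and is therefore not covered by this 60-nucleotide window.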
FIG. 2-1 Translation of DNA K9-1

GlyCysProGluArgLeuAlaSerCysArgProLeuThrAspPheAspGlnGlyTrpGly
1 CAGGCTGTCCTGAGAGGCTAGCCAGCTGCCGACCCCTTACCGATTTTGACCAGGGCTGGG
GTCCGACAGGACTCTCCGATCGGTCGACGGCTGGGGAATGGCTAAAACTGGTCCCGACCC 

ProIleSerTyrAlaAsnGlySerGlyProAspGlnArgProTyrCysTrpHisTyrPro
61 GCCCTATCAGTTATGCCAACGGAAGCGGCCCCGACCAGCGCCCCTACTGCTGGCACTACC 
CGGGATAGTCAATACGGTTGCCTTCGCCGGGGCTGGTCGCGGGGATGACGACCGTGATGG 

ProLysProCysGlyIleValProAlaLysSerValCysGlyProValTyrCysPheThr
121 CCCCAAAACCTTGCGGTATTGTGCCCGCGAAGAGTGTGTGTGGTCCGGTATATTGCTTCA
GGGGTTTTGGAACGCCATAACACGGGCGCTTCTCACACACACCAGGCCATATAACGAAGT 

ProSerProValValValGlyThrThrAspArgSerGlyAlaProThrTyrSerTrpGly
181 CTCCCAGCCCCGTGGTGGTGGGAACGACCGACAGGTCGGGCGCGCCCACCTACAGCTGGG 
GAGGGTCGGGGCACCACCACCCTTGCTGGCTGTCCAGCCCGCGCGGGTGGATGTCGACCC 

GluAsnAspThrAspValPheValLeuAsnAsnThrArgProProLeuGlyAsnTrpPhe
241 GTGAAAATGATACGGACGTCTTCGTCCTTAACAATACCAGGCCACCGCTGGGCAATTGGT 
CACTTTTACTATGCCTGCAGAAGCAGGAATTGTTATGGTCCGGTGGCGACCCGTTAACCA 

GlyCysThrTrpMetAsnSerThrGlyPheThrLysValCysGlyAlaProProCysVal 
301 TCGGTTGTACCTGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCTCCTTGTG 
AGCCAACATGGACCTACTTGAGTTGACCTAAGTGGTTTCACACGCCTCGCGGAGGAACAC 

IleGlyGlyAlaGlyAsnAsnThrLeuHisCysProThrAspCysPheArgLysHisPro
361 TCATCGGAGGGGCGGGCAACAACACCCTGCACTGCCCCACTGATTGCTTCCGCAAGCATC
AGTAGCCTCCCCGCCCGTTGTTGTGGGACGTGACGGGGTGACTAACGAAGGCGTTCGTAG 

AspAlaThrTyrSerArgCysGlySerGlyProTrpIleThrProArgCysLeuValAsp 
421 CGGACGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGATCACACCCAGGTGCCTGGTCG
GCCTGCGGTGTATGAGAGCCACGCCGAGGCCAGGGACCTAGTGTGGGTCCACGGACCAGC 


TyrProTyrArgLeuTrpHisTyrProCysThrIleAsnTyrThrIlePheLysIleArg
481 ACTACCCGTATAGGCTTTGGCATTATCCTTGTACCATCAACTACACTATATTTAAAATCA
TGATGGGCATATCCGAAACCGTAATAGGAACATGGTAGTTGATGTGATATAAATTTTAGT 


MetTyrValGlyGlyValGluHisArgLeuGluAlaAlaCysAsnTrpThrArgGlyGlu 
541 GGATGTACGTGGGAGGGGTCGAGCACAGGCTGGAAGCTGCCTGCAACTGGACGCGGGGCG 
CCTACATGCACCCTCCCCAGCTCGTGTCCGACCTTCGACGGACGTTGACCTGCGCCCCGC 


ArgCysAspLeuGluAspArgAspArgSerGluLeuSerProLeuLeuLeuThrThrThr
601 AACGTTGCGATCTGGAAGATAGGGACAGGTCCGAGCTCAGCCCGTTACTGCTGACCACTA 
TTGCAACGCTAGACCTTCTATCCCTGTCCAGGCTCGAGTCGGGCAATGACGACTGGTGAT 


GlnTrpGlnValLeuProCysSerPheThrThrLeuProAlaLeuSerThrGlyLeuIle 
661 CACAGTGGCAGGTCCTCCCGTGTTCCTTCACAACCCTGCCAGCCTTGTCCACCGGCCTCA 
GTGTCACCGTCCAGGAGGGCACAAGGAAGTGTTGGGACGGTCGGAACAGGTGGCCGGAGT 


HisLeuHisGlnAsnIleValAspValGlnTyrLeuTyrGlyValGlySerSerIleAla
721 TCCACCTCCACCAGAACATTGTGGACGTGCAGTACTTGTACGGGGTGGGGTCAAGCATCG
AGGTGGAGGTGGTCTTGTAACACCTGCACGTCATGAACATGCCCCACCCCAGTTCGTAGC


SerTrpAlaIleLysTrpGluTyrValValLeuLeuPheLeuLeuLeuAlaAspAlaArg
781 CGTCCTGGGCCATTAAGTGGGAGTACGTCGTCCTCCTGTTCCTTCTGCTTGCAGACGCGC
GCAGGACCCGGTAATTCACCCTCATGCAGCAGGAGGACAAGGAAGACGAACGTCTGCGCG 
ValCysSerCysLeuTrpMetMetLeuLeuIleSerGlnAlaGluAlaAlaLeuGluAsn
841 GCGTCTGCTCCTGCTTGTGGATGATGCTACTCATATCCCAAGCGGAAGCGGCTTTGGAGA
CGCAGACGAGGACGAACACCTACTACGATGAGTATAGGGTTCGCCTTCGCCGAAACCTCT 


LeuValIleLeuAsnAlaAlaSerLeuAlaGlyThrHisGlyLeuValSerPheLeuVal
901 ACCTCGTAATACTTAATGCAGCATCCCTGGCCGGGACGCACGGTCTTGTATCCTTCCTCG
TGGAGCATTATGAATTACGTCGTAGGGACCGGCCCTGCGTGCCAGAACATAGGAAGGAGC 


PhePheCysPheAlaTrpTyrLeuLysGlyLysTrpValProGlyAlaValTyrThrPhe
961 TGTTCTTCTGCTTTGCATGGTATCTGAAGGGTAAGTGGGTGCCCGGAGCGGTCTACACCT
ACAAGAAGACGAAACGTACCATAGACTTCCCATTCACCCACGGGCCTCGCCAGATGTGGA 


TyrGlyMetTrpProLeuLeuLeuLeuLeuLeuAlaLeuProGlnArgAlaTyrAlaLeu
1021 TCTACGGGATGTGGCCTCTCCTCCTGCTCCTGTTGGCGTTGCCCCAGCGGGCGTACGCGC
AGATGCCCTACACCGGAGAGGAGGACGAGGACAACCGCAACGGGGTCGCCCGCATGCGCG


AspThrGluValAlaAlaSerCysGlyGlyValValLeuValGlyLeuMetAlaLeuThr 
1081 TGGACACGGAGGTGGCCGCGTCGTGTGGCGGTGTTGTTCTCGTCGGGTTGATGGCGCTAA 
ACCTGTGCCTCCACCGGCGCAGCACACCGCCACAACAAGAGCAGCCCAACTACCGCGATT 


LeuSerProTyrTyrLysArgTyrIleSerTrpCysLeuTrpTrpLeuGlnTyrPheLeu
1141 CTCTGTCACCATATTACAAGCGCTATATCAGCTGGTGCTTGTGGTGGCTTCAGTATTTTC
GAGACAGTGGTATAATGTTCGCGATATAGTCGACCACGAACACCACCGAAGTCATAAAAG


ThrArgValGluAlaGlnLeuHisValTrpIleProProLeuAsnValArgGlyGlyArg
1201 TGACCAGAGTGGAAGCGCAACTGCACGTGTGGATTCCCCCCCTCAACGTCCGAGGGGGGC 
ACTGGTCTCACCTTCGCGTTGACGTGCACACCTAAGGGGGGGAGTTGCAGGCTCCCCCCG 


AspAlaValIleLeuLeuMetCysAlaValHisProThrLeuValPheAspIleThrLys
1261 GCGACGCTGTCATCTTACTCATGTGTGCTGTACACCCGACTCTGGTATTTGACATCACCA 
CGCTGCGACAGTAGAATGAGTACACACGACATGTGGGCTGAGACCATAAACTGTAGTGGT 


LeuLeuLeuAlaValPheGlyProLeuTrpIleLeuGlnAla
1321 AATTGCTGCTGGCCGTCTTCGGACCCCTTTGGATTCTTCAAGCCAG
TTAACGACGACCGGCAGAAGCCTGGGGAAACCTAAGAAGTTCGGTC 


FIG. 2-2 




FIG. 3 


GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla 
1 CGGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGC 
GCCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCG 


AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
61 TGCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTT
ACGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAA 


AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
121 TGCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGC 
ACGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCG 

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
181 CAGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGA
GTCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCT 

ProLeuAspLeuProProIleIleGlnArgLeu
241 ACCACTTGATCTACCTCCAATCATTCAAAGACTC
TGGTGAACTAGATGGAGGTTAGTAAGTTTCTGAG


FIG. 5 

Translation of DNA 26j

LeuPheTyrHisHisLysPheAsnSerSerGlyCysProGluArgLeuAlaSerCysArg 
1 GCTTTTCTATCACCACAAGTTCAACTCTTCAGGCTGTCCTGAGAGGCTAGCCAGCTGCCG 
CGAAAAGATAGTGGTGTTCAAGTTGAGAAGTCCGACAGGACTCTCCGATCGGTCGACGGC

ProLeuThrAspPheAspGlnGlyTrpGlyProIleSerTyrAlaAsnGlySerGlyPro 
61 ACCCCTTACCGATTTTGACCAGGGCTGGGGCCCTATCAGTTATGCCAACGGAAGCGGCCC 
TGGGGAATGGCTAAAACTGGTCCCGACCCCGGGATAGTCAATACGGTTGCCTTCGCCGGG 

AspGlnArgProTyrCysTrpHisTyrProProLysProCysGlyIleValProAlaLys
121 CGACCAGCGCCCCTACTGCTGGCACTACCCCCCAAAACCTTGCGGTATTGTGCCCGCGAA
GCTGGTCGCGGGGATGACGACCGTGATGGGGGGTTTTGGAACGCCATAACACGGGCGCTT 

Overlap with 13i
SerValCysGlyProValTyrCysPheThrProSerProValValVal 

181 GAGTGTGTGTGGTCCGGTATATTGCTTCACTCCCAGCCCCGTGGTGGTGGG 
CTCACACACACCAGGCCATATAACGAAGTGAGGGTCGGGGCACCACCACCC 
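
The "Overlap with" annotations in these figures mark where the end of one clone coincides with the start of the next. A minimal sketch of how such an end-to-start overlap could be located is given below; it assumes exact matching with no mismatches, which is a simplification, since the overlapping clones in the figures can differ at individual positions. The function name and the minimum length parameter are hypothetical. The two strings are the last sense line of clone 26j and the first sense line of clone 13i as printed in FIG. 5 and FIG. 4.

```python
def suffix_prefix_overlap(left: str, right: str, min_len: int = 8) -> int:
    """Return the length of the longest suffix of `left` that is also a
    prefix of `right`, or 0 if no such overlap of at least `min_len` exists."""
    left, right = left.upper(), right.upper()
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge_on_overlap(left: str, right: str, min_len: int = 8) -> str:
    """Join two sequences, writing the shared overlap only once."""
    k = suffix_prefix_overlap(left, right, min_len)
    return left + right[k:]


# Last sense-strand line of clone 26j (FIG. 5).
CLONE_26J_END = "GAGTGTGTGTGGTCCGGTATATTGCTTCACTCCCAGCCCCGTGGTGGTGGG"
# First sense-strand line of clone 13i (FIG. 4).
CLONE_13I_START = "CTCCCAGCCCCGTGGTGGTGGGAACGACCGACAGGTCGGGCGCGCCTACCTACAGCTGGG"

overlap_len = suffix_prefix_overlap(CLONE_26J_END, CLONE_13I_START)
```

A combined sequence such as the one shown later for the composite ORF could, in principle, be assembled by applying `merge_on_overlap` pairwise along the clone series, although real assembly would also need to tolerate the nucleotide heterogeneity noted in some figures.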
FIG. 4

Translation of DNA 13i 

ProSerProValValValGlyThrThrAspArgSerGlyAlaProThrTyrSerTrpGly
1 CTCCCAGCCCCGTGGTGGTGGGAACGACCGACAGGTCGGGCGCGCCTACCTACAGCTGGG 
GAGGGTCGGGGCACCACCACCCTTGCTGGCTGTCCAGCCCGCGCGGATGGATGTCGACCC 

GluAsnAspThrAspValPheValLeuAsnAsnThrArgProProLeuGlyAsnTrpPhe
61 GTGAAAATGATACGGACGTCTTCGTCCTTAACAATACCAGGCCACCGCTGGGCAATTGGT
CACTTTTACTATGCCTGCAGAAGCAGGAATTGTTATGGTCCGGTGGCGACCCGTTAACCA

GlyCysThrTrpMetAsnSerThrGlyPheThrLysValCysGlyAlaProProCysVal 
121 TCGGTTGTACCTGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCTCCTTGTG 
AGCCAACATGGACCTACTTGAGTTGACCTAAGTGGTTTCACACGCCTCGCGGAGGAACAC 

IleGlyGlyAlaGlyAsnAsnThrLeuHisCysProThrAspCysPheArgLysHisPro 
181 TCATCGGAGGGGCGGGCAACAACACCCTGCACTGCCCCACTGATTGCTTCCGCAAGCATC
AGTAGCCTCCCCGCCCGTTGTTGTGGGACGTGACGGGGTGACTAACGAAGGCGTTCGTAG 

AspAlaThrTyrSerArgCysGlySerGlyProTrpLeuThrProArgCysLeuValAsp
241 CGGACGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGCTCACACCCAGGTGCCTGGTCG 
GCCTGCGGTGTATGAGAGCCACGCCGAGGCCAGGGACCGAGTGTGGGTCCACGGACCAGC 


TyrProTyrArgLeuTrpHisTyrProCysThrIleAsnTyrThrIlePheLysIleArg
301 ACTACCCGTATAGGCTTTGGCATTATCCTTGTACCATCAACTACACCATATTTAAAATCA 
TGATGGGCATATCCGAAACCGTAATAGGAACATGGTAGTTGATGTGGTATAAATTTTAGT 


MetTyrValGlyGlyValGluHisArgLeuGluAlaAlaCysAsnTrpThrArgGlyGlu 

361 GGATGTACGTGGGAGGGGTCGAGCACAGGCTGGAAGCTGCCTGCAACTGGACGCGGGGCG
CCTACATGCACCCTCCCCAGCTCGTGTCCGACCTTCGACGGACGTTGACCTGCGCCCCGC 

Overlap with 12f

ArgCysAspLeuGluAspArgAspArgSerGluLeuSerProLeuLeuLeuThrThrThr
421 AACGTTGCGATCTGGAAGACAGGGACAGGTCCGAGCTCAGCCCGTTACTGCTGACCACTA
TTGCAACGCTAGACCTTCTGTCCCTGTCCAGGCTCGAGTCGGGCAATGACGACTGGTGAT 


GlnTrpGlnValLeuProCysSerPheThrThrLeuProAlaLeuSerThrGlyLeu 
481 CACAGTGGCAGGTCCTCCCGTGTTCCTTCACAACCCTGCCAGCCTTGTCCACCGGCCTCA 
GTGTCACCGTCCAGGAGGGCACAAGGAAGTGTTGGGACGGTCGGAACAGGTGGCCGGAGT 
FIG. 6


Translation of DNA CA59a 

LeuValMetAlaGlnLeuLeuArgIleProGlnAlaIleLeuAspMetIleAlaGlyAla
1 TTGGTAATGGCTCAGCTGCTCCGGATCCCACAAGCCATCTTGGACATGATCGCTGGTGCT 
AACCATTACCGAGTCGACGAGGCCTAGGGTGTTCGGTAGAACCTGTACTAGCGACCACGA 

HisTrpGlyValLeuAlaGlyIleAlaTyrPheSerMetValGlyAsnTrpAlaLysVal
61 CACTGGGGAGTCCTGGCGGGCATAGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTC
GTGACCCCTCAGGACCGCCCGTATCGCATAAAGAGGTACCACCCCTTGACCCGCTTCCAG 

LeuValValLeuLeuLeuPheAlaGlyValAspAlaGluThrHisValThrGlyGlySer
121 CTGGTAGTGCTGCTGCTATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGT
GACCATCACGACGACGATAAACGGCCGCAGCTGCGCCTTTGGGTGCAGTGGCCCCCTTCA

AlaGlyHisThrValSerGlyPheValSerLeuLeuAlaProGlyAlaLysGlnAsnVal
181 GCCGGCCACACTGTGTCTGGATTTGTTAGCCTCCTCGCACCAGGCGCCAAGCAGAACGTC 
CGGCCGGTGTGACACAGACCTAAACAATCGGAGGAGCGTGGTCCGCGGTTCGTCTTGCAG 

GlnLeuIleAsnThrAsnGlySerTrpHisLeuAsnSerThrAlaLeuAsnCysAsnAsp
241 CAGCTGATCAACACCAACGGCAGTTGGCACCTCAATAGCACGGCCCTGAACTGCAATGAT 
GTCGACTAGTTGTGGTTGCCGTCAACCGTGGAGTTATCGTGCCGGGACTTGACGTTACTA 


SerLeuAsnThrGlyTrpLeuAlaGlyLeuPheTyrHisHisLysPheAsnSerSerGly
301 AGCCTCAACACCGGCTGGTTGGCAGGGCTTTTCTATCACCACAAGTTCAACTCTTCAGGC 
TCGGAGTTGTGGCCGACCAACCGTCCCGAAAAGATAGTGGTGTTCAAGTTGAGAAGTCCG 

Overlap with 26j


Overlap with K9-1 

CysProGluArgLeuAlaSerCysArgPro 
361 TGTCCTGAGAGGCTAGCCAGCTGCCGACCCC
ACAGGACTCTCCGATCGGTCGACGGCTGGGG 
FIG. 7 Translation of DNA CA84a

GlnGlyCysAsnCysSerIleTyrProGlyHisIleThrGlyHisArgMetAlaTrpAsp

1 CGCAAGGTTGCAATTGCTCTATCTATCCCGGCCATATAACGGGTCACCGCATGGCATGGG 
GCGTTCCAACGTTAACGAGATAGATAGGGCCGGTATATTGCCCAGTGGCGTACCGTACCC 


MetMetMetAsnTrpSerProThrThrAlaLeuValMetAlaGlnLeuLeuArgIlePro
61 ATATGATGATGAACTGGTCCCCTACGACGGCGTTGGTAATGGCTCAGCTGCTCCGGATCC 
TATACTACTACTTGACCAGGGGATGCTGCCGCAACCATTACCGAGTCGACGAGGCCTAGG 


GlnAlaIleLeuAspMetIleAlaGlyAlaHisTrpGlyValLeuAlaGlyIleAlaTyr
121 CACAAGCCATCTTGGACATGATCGCTGGTGCTCACTGGGGAGTCCTGGCGGGCATAGCGT 
GTGTTCGGTAGAACCTGTACTAGCGACCACGAGTGACCCCTCAGGACCGCCCGTATCGCA 

Overlap with CA59a 

PheSerMetValGlyAsnTrpAlaLysValLeuValValLeuLeuLeuPheAlaGlyVal 
181 ATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTGCTGCTGCTATTTGCCGGCG 
TAAAGAGGTACCACCCCTTGACCCGCTTCCAGGACCATCACGACGACGATAAACGGCCGC 


AspAlaGluThrHisValThrGly
241 TCGACGCGGAAACCCACGTCACCGGGG
AGCTGCGCCTTTGGGTGCAGTGGCCCC 


FIG. 8 Translation of DNA CA156e

CysTrpValAlaMetThrProThrValAlaThrArgAspGlyLysLeuProAlaThrGln
1 GTGTTGGGTGGCGATGACCCCTACGGTGGCCACCAGGGATGGCAAACTCCCCGCGACGCA 
CACAACCCACCGCTACTGGGGATGCCACCGGTGGTCCCTACCGTTTGAGGGGCGCTGCGT 

LeuArgArgHisIleAspLeuLeuValGlySerAlaThrLeuCysSerAlaLeuTyrVal 
61 GCTTCGACGTCACATCGATCTGCTTGTCGGGAGCGCCACCCTCTGTTCGGCCCTCTACGT 
CGAAGCTGCAGTGTAGCTAGACGAACAGCCCTCGCGGTGGGAGACAAGCCGGGAGATGCA 

GlyAspLeuCysGlySerValPheLeuValGlyGlnLeuPheThrPheSerProArgArg
121 GGGGGACCTATGCGGGTCTGTCTTTCTTGTCGGCCAACTGTTCACCTTCTCTCCCAGGCG 
CCCCCTGGATACGCCCAGACAGAAAGAACAGCCGGTTGACAAGTGGAAGAGAGGGTCCGC 

HisTrpThrThrGlnGlyCysAsnCysSerIleTyrProGlyHisIleThrGlyHisArg
181 CCACTGGACGACGCAAGGTTGCAATTGCTCTATCTATCCCGGCCATATAACGGGTCACCG 
GGTGACCTGCTGCGTTCCAACGTTAACGAGATAGATAGGGCCGGTATATTGCCCAGTGGC 

Overlap with CA84a 

MetAlaTrpAspMetMetMetAsnTrpSerProThrThrAlaLeuValValAlaGlnLeu
241 CATGGCATGGGATATGATGATGAACTGGTCCCCTACGACGGCGTTGGTAGTGGCTCAGCT 
GTACCGTACCCTATACTACTACTTGACCAGGGGATGCTGCCGCAACCATCACCGAGTCGA 


LeuArgIleProGlnAla
301 GCTCCGGATCCCACAAGCC
CGAGGCCTAGGGTGTTCGG 
FIG. 9


Translation of DNA CA167b 
SerThrGlyLeuTyrHisValThrAsnAspCysProAsnSerSerIleValTyrGluAla

1 CTCCACGGGGCTTTACCACGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGGC 
GAGGTGCCCCGAAATGGTGCAGTGGTTACTAACGGGATTGAGCTCATAACACATGCTCCG

AlaAspAlaIleLeuHisThrProGlyCysValProCysValArgGluGlyAsnAlaSer
61 GGCCGATGCCATCCTGCACACTCCGGGGTGCGTCCCTTGCGTTCGTGAGGGCAACGCCTC 
CCGGCTACGGTAGGACGTGTGAGGCCCCACGCAGGGAACGCAAGCACTCCCGTTGCGGAG 


ArgCysTrpValAlaMetThrProThrValAlaThrArgAspGlyLysLeuProAlaThr 
121 GAGGTGTTGGGTGGCGATGACCCCTACGGTGGCCACCAGGGATGGCAAACTCCCCGCGAC 
CTCCACAACCCACCGCTACTGGGGATGCCACCGGTGGTCCCTACCGTTTGAGGGGCGCTG 

Overlap with CA156e

GlnLeuArgArgHisIleAspLeuLeuValGlySerAlaThrLeuCysSerAlaLeuTyr 

181 GCAGCTTCGACGTCACATCGATCTGCTTGTCGGGAGCGCTACCCTCTGTTCGGCCCTCTA 
CGTCGAAGCTGCAGTGTAGCTAGACGAACAGCCCTCGCGATGGGAGACAAGCCGGGAGAT 


ValGlyAspLeuCysGlySerValPheLeu 
241 CGTGGGGGACTTGTGCGGGTCTGTCTTTCTTG 
GCACCCCCTGAACACGCCCAGACAGAAAGAAC 
FIG. 10

Translation of DNA ssCA216a 

ArgArgArgSerArgAsnLeuGlyLysValIleAspThrLeuThrCysGlyPheAlaAsp
1 CCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGATACCCTTACGTGCGGCTTCGCCG 
GGGCCGCATCCAGCGCGTTAAACCCATTCCAGTAGCTATGGGAATGCACGCCGAAGCGGC 

LeuMetGlyTyrIleProLeuValGlyAlaProLeuGlyGlyAlaAlaArgAlaLeuAla
61 ACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTTGGAGGCGCTGCCAGGGCCCTGG 
TGGAGTACCCCATGTATGGCGAGCAGCCGCGGGGAGAACCTCCGCGACGGTCCCGGGACC 

HisGlyValArgValLeuGluAspGlyValAsnTyrAlaThrGlyAsnLeuProGlyCys 
121 CGCATGGCGTCCGGGTTCTGGAAGACGGCGTGAACTATGCAACAGGGAACCTTCCTGGTT 
GCGTACCGCAGGCCCAAGACCTTCTGCCGCACTTGATACGTTGTCCCTTGGAAGGACCAA 

SerPheSerIlePheLeuLeuAlaLeuLeuSerCysLeuThrValProAlaSerAlaTyr
181 GCTCTTTCTCTATCTTCCTTCTGGCCCTGCTCTCTTGCTTGACTGTGCCCGCTTCGGCCT 
CGAGAAAGAGATAGAAGGAAGACCGGGACGAGAGAACGAACTGACACGGGCGAAGCCGGA 


GlnValArgAsnSerThrGlyLeuTyrHisValThrAsnAspCysProAsnSerSerIle
241 ACCAAGTGCGCAACTCCACGGGGCTTTACCACGTCACCAATGATTGCCCTAACTCGAGTA
TGGTTCACGCGTTGAGGTGCCCCGAAATGGTGCAGTGGTTACTAACGGGATTGAGCTCAT 

overlap with CA167b 

ValTyrGluAlaAlaAspAlaIleLeuHisThrProGlyCysValProCysValArgGlu
301 TTGTGTACGAAGCGGCCGATGCCATCCTGCACACTCCGGGGTGCGTCCCTTGCGTTCGTG 
AACACATGCTTCGCCGGCTACGGTAGGACGTGTGAGGCCCCACGCAGGGAACGCAAGCAC 


GlyAsnAlaSerArgCysTrpValAlaMetThrProThrValAla 
361 AGGGCAACGCCTCGAGGTGTTGGGTGGCGATGACCCCTACGGTGGCC 
TCCCGTTGCGGAGCTCCACAACCCACCGCTACTGGGGATGCCACCGG 
FIG. 11

Translation of DNA ssCA290a 

LysLysAsnLysArgAsnThrAsnArgArgProGlnAspValLysPheProGlyGlyGly 
1 AAAAAAAAAACAAACGTAACACCAACCGTCGCCCACAGGACGTCAAGTTCCCGGGTGGCG 
TTTTTTTTTTGTTTGCATTGTGGTTGGCAGCGGGTGTCCTGCAGTTCAAGGGCCCACCGC 

GlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeuGlyValArgAla
61 GTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTGGGTGTGCGCG 
CAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAACCCACACGCGC 

ThrArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnProIleProLysAla 
121 CGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGCCAGCCTATCCCCAAGG
GCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCGGTCGGATAGGGGTTCC 

ArgArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpProLeuTyrGlyAsn
181 CTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCA 
GAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGGGAGATACCGT 

GluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArgProSerTrpGly
241 ATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGGCCTAGCTGGG 
TACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCCGGATCGACCC 


ProThrAspProArgArgArgSerArgAsnLeuGlyLysValIleAspThrLeuThrCys

301 GCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGATACCCTTACGT 
CGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCCAGTAGCTATGGGAATGCA 


GlyPheAlaAspLeuMetGlyTyrIleProLeuValGlyAlaProLeuGlyGlyAlaAla
361 GCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTTGGAGGCGCTG 
CGCCGAAGCGGCTGGAGTACCCCATGTATGGCGAGCAGCCGCGGGGAGAACCTCCGCGAC 

overlap with CA216a 

ArgAlaLeuAlaHisGlyValArgValLeuGluAspGlyValAsnTyrAlaThrGlyAsn 
421 CCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAAGACGGCGTGAACTATGCAACAGGGA 
GGTCCCGGGACCGCGTACCGCAGGCCCAAGACCTTCTGCCGCACTTGATACGTTGTCCCT 


LeuProGlyCysSerPheSerThrPhe 
481 ACCTTCCTGGTTGCTCTTTCTCTACCTTC 
TGGAAGGACCAACGAGAAAGAGATGGAAG 
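
The claims refer to a homology "at the polypeptide level of at least 40%" to the 169 amino acid sequence of this figure. As a rough illustration only, the sketch below computes percent identity between two sequences that have already been aligned to equal length. This is an assumption made for the example: the claims do not state how homology is to be computed, and similarity measures other than strict identity may be intended. The function names, the gap character "-" and the convention of skipping gap columns are all hypothetical choices. The two 20-residue strings are one-letter transcriptions of the first amino acid line of this figure and of the corresponding residues in FIG. 13.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues between two pre-aligned sequences.

    Columns in which either sequence carries a gap ('-') are skipped,
    which is only one of several conventions in use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_homology_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 40.0) -> bool:
    """True if the percent identity is at least `threshold`."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# First 20 residues of FIG. 11 (LysLysAsnLysArgAsnThrAsn...), one-letter code.
FIG11_START = "KKNKRNTNRRPQDVKFPGGG"
# Corresponding residues from FIG. 13 (ArgLysThrLysArgAsnThrAsn...).
FIG13_REGION = "RKTKRNTNRRPQDVKFPGGG"

identity = percent_identity(FIG11_START, FIG13_REGION)
```

In this example the two fragments differ at two of twenty positions. A full comparison against the 169-residue sequence would first require an alignment step, which is outside the scope of this sketch.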
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Translation of DNA ag30a 
FIG I 2 ~ | #MetSerValValGlnProProGlyProProLeu 

#MetAlaLeuValOP 

i cgcagaaagcgtctagccatggcgttagtaox;agtgtcgtgcagcctccaggaccccccc 

GCGTCTTTCGCAGATCGGTACCGCAATCATACTCACAGCACGTCGGAGGTCCTGGGGGGG 
ProGlyGluProAM

61 TCCCGGGAGAGCCATAGTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGAC 
AGGGCCCTCTCGGTATCACCAGACGCCTTGGCCACTCATGTGGCCTTAACGGTCCTGCTG 

#MetProGlyAspLeuGlyValProProGlnAsp 

121 CGGGTCCTTTCTTGGATCAACCCGCTCAATGCCTGGAGATTTGGGCGTGCCCCCGCAAGA 
GCCCAGGAAAGAACCTAGTTGGGCGAGTTACGGACCTCTAAACCCGCACGGGGGCGTTCT 

OP AM GlyAlaCys 
CysAM * 

181 CTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGCCTTGTGGTACTGCCTGATAGGGTGCTT
GACGATCGGCTCATCACAACCCAGCGCTTTCCGGAACACCATGACGGACTATCCCACGAA 


GluCysProGlyArgSerArgArgProCysThrMetSerThrAsnProLysProGlnLys 

241 GCGAGTGCCCCGGGAGGTCTCGTAGACCGTGCACCATGAGCACGAATCCTAAACCTCAAA 
CGCTCACGGGGCCCTCCAGAGCATCTGGCACGTGGTACTCGTGCTTAGGATTTGGAGTTT

LysAsnLysArgAsnThrAsnArgArgProGlnAspValLysPheProGlyGlyGlyGln


301 AAAAAAACAAACGTAACACCAACCGTCGCCCACAGGACGTCAAGTTCCCGGGTGGCGGTC 
TTTTTTTGTTTGCATTGTGGTTGGCAGCGGGTGTCCTGCAGTTCAAGGGCCCACCGCCAG

IleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeuGlyValArgAlaThr 


361 AGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTGGGTGTGCGCGCGA 
TCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAACCCACACGCGCGCT 

ArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnProIleProLysAlaArg


421 CGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCTATCCCCAAGGCTC
GCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGATAGGGGTTCCGAG 

ArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpProLeuTyrGlyAsnGlu 


overlap with CA290a 

481 GTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATG
CAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGGGAGATACCGTTAC 

GlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArgProSerTrpGlyPro 


541 AGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGGCCTAGCTGGGGCC
TCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCCGGATCGACCCCGG 

ThrAspProArgArgArgSerArgAsnLeuGlyLysValIleAspThrLeuThrCysGly
601 CCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGATACCCTTACGTGCG
GGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCCAGTAGCTATGGGAATGCACGC 

Phe 


661 GCTTC 
CGAAG 


* - Start of long HCV ORF 

I - Putative first amino acid of large HCV polyprotein 

# - Putative small encoded peptides (that may play a translational regulatory role)


FIG. 12-2 
FIG. 13

Translation of DNA CA205a 

ValLeuGlyArgGluArgProCysGlyThrAlaOP AM GlyAlaCysGluCysProGly 
1 GTCTTGGGTCGCGAAAGGCCTTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGG 
CAGAACCCAGCGCTTTCCGGAACACCATGACGGACTATCCCACGAACGCTCACGGGGCCC 


ArgSerArgArgProCysThrMetSerThrAsnProLysProGlnArgLysThrLysArg 
61 AGGTCTCGTAGACCGTGCACCATGAGCACGAATCCTAAACCTCAAAGAAAAACCAAACGT 
TCCAGAGCATCTGGCACGTGGTACTCGTGCTTAGGATTTGGAGTTTCTTTTTGGTTTGCA


AsnThrAsnArgArgProGlnAspValLysPheProGlyGlyGlyGlnIleValGlyGly
121 AACACCAACCGTCGCCCACAGGACGTCAAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGA 
TTGTGGTTGGCAGCGGGTGTCCTGCAGTTCAAGGGCCCACCGCCAGTCTAGCAACCACCT 


ValTyrLeuLeuProArgArgGlyProArgLeuGlyValArgAlaThrArgLysThrSer 
181 GTTTACTTGTTGCCGCGCAGGGGCCCTAGATTGGGTGTGCGCGCGACGAGAAAGACTTCC
CAAATGAACAACGGCGCGTCCCCGGGATCTAACCCACACGCGCGCTGCTCTTTCTGAAGG 

overlap with CA290a 

GluArgSerGlnProArgGlyArgArgGlnProIleProLysAlaArgArgProGluGly 
241 GAGCGGTCGCAACCTCGAGGTAGACGTCAGCCTATCCCCAAGGCTCGTCGGCCCGAGGGC 
CTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGATAGGGGTTCCGAGCAGCCGGGCTCCCG 


ArgThrTrpAlaGlnProGlyTyrProTrpProLeuTyrGlyAsnGluGlyCys 
301 AGGACCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGCTGCG
TCCTGGACCCGAGTCGGGCCCATGGGAACCGGGGAGATACCGTTACTCCCGACGC 


* - putative initiator methionine codon 
FIG. 14

Translation of DNA 18g 


#ProProOP 

#SerThrMetAsnHisSerProValArgAsnTyrCysLeuHisAlaGluSerValAM Pro
#LeuHisHisGluSerLeuProCysGluGluLeuLeuSerSerArgArgLysArgLeuAla 
1 CTCCACCATGAATCACTCCCCTGTGAGGAACTACTGTCTTCACGCAGAAAGCGTCTAGCC 
GAGGTGGTACTTAGTGAGGGGACACTCCTTGATGACAGAAGTGCGTCTTTCGCAGATCGG 


#MetSerValValGlnProProGlyProProLeuProGlyGluProAM 
MetAlaLeuValOP 

61 ATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCATAGT 
TACCGCAATCATACTCACAGCACGTCGGAGGTCCTGGGGGGGAGGGCCCTCTCGGTATCA 


121 GGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATC
CCAGACGCCTTGGCCACTCATGTGGCCTTAACGGTCCTGCTGGCCCAGGAAAGAACCTAG 


overlap with ag30a 

#MetProGlyAspLeuGlyValProProGlnAspCysAM 

181 AACCCGCTCAATGCCTGGAGATTTGGGCGTGCCCCCGCAAGACTGCTAGCCGAGTAGTGT 
TTGGGCGAGTTACGGACCTCTAAACCCGCACGGGGGCGTTCTGACGATCGGCTCATCACA 


OP AM GlyAlaCysGluCysProGlyArgSer 
* 

241 TGGGTCGCGAAAGGCCTTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGT
ACCCAGCGCTTTCCGGAACACCATGACGGACTATCCCACGAACGCTCACGGGGCCCTCCA 


ArgArg 


301 CTCGTAGA 
GAGCATCT 


* - Start of long HCV ORF 

# - Putative small encoded peptides (that may play a translational regulatory role)
FIG. 15


Translation of DNA 16jh

Overlap with 15e 

GlyAlaCysTyrSerIleGluProLeuAspLeuProProIleIleGlnArgLeuHisGly
1 GGGGCCTGCTACTCCATAGAACCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGC
CCCCGGACGATGAGGTATCTTGGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCG

LeuSerAlaPheSerLeuHisSerTyrSerProGlyGluIleAsnArgValAlaAlaCys 
61 CTCAGCGCATTTTCACTCCACAGTTACTCTCCAGGTGAAATTAATAGGGTGGCCGCATGC
GAGTCGCGTAAAAGTGAGGTGTCAATGAGAGGTCCACTTTAATTATCCCACCGGCGTACG

Gly* 
G 

LeuArgLysLeuGlyValProProLeuArgAlaTrpArgHisArgAlaArgSerValArg 
121 CTCAGAAAACTTGGGGTACCGCCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGC
GAGTCTTTTGAACCCCATGGCGGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCG 

AlaArgLeuLeuAlaArgGlyGlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrp
181 GCTAGGCTTCTGGCCAGAGGAGGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGG 
CGATCCGAAGACCGGTCTCCTCCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACC 

AlaValArgThrLysLeuLys 
241 GCAGTAAGAACAAAGCTCAAAC 
CGTCATTCTTGTTTCGAGTTTG


* - nucleotide heterogeneity 
COMBINED ORF OF DNAs pi14a THROUGH 15e.

FIG. 16-1

(pi14a/CA167b/CA156e/CA84a/CA59a/K9-1/12f/14i/11b/7f/7e/
8h/33c/40b/37b/35/36/81/32/33b/25c/14c/8f/33f/33g/39c/
35f/19g/26g & 15e)


ArgSerArgAsnLeuGlyLysValIleAspThrLeuThrCysGlyPheAlaAspLeuMet
1 AGGTCGCGCAATTTGGGTAAGGTCATCGATACCCTTACGTGCGGCTTCGCCGACCTCATG
TCCAGCGCGTTAAACCCATTCCAGTAGCTATGGGAATGCACGCCGAAGCGGCTGGAGTAC 

GlyTyrIleProLeuValGlyAlaProLeuGlyGlyAlaAlaArgAlaLeuAlaHisGly
61 GGGTACATACCGCTCGTCGGCGCCCCTCTTGGAGGCGCTGCCAGGGCCCTGGCGCATGGC 
CCCATGTATGGCGAGCAGCCGCGGGGAGAACCTCCGCGACGGTCCCGGGACCGCGTACCG 

ValArgValLeuGluAspGlyValAsnTyrAlaThrGlyAsnLeuProGlyCysSerPhe
121 GTCCGGGTTCTGGAAGACGGCGTGAACTATGCAACAGGGAACCTTCCTGGTTGCTCTTTC 
CAGGCCCAAGACCTTCTGCCGCACTTGATACGTTGTCCCTTGGAAGGACCAACGAGAAAG 

SerIlePheLeuLeuAlaLeuLeuSerCysLeuThrValProAlaSerAlaTyrGlnVal
181 TCTATCTTCCTTCTGGCCCTGCTCTCTTGCTTGACTGTGCCCGCTTCGGCCTACCAAGTG 
AGATAGAAGGAAGACCGGGACGAGAGAACGAACTGACACGGGCGAAGCCGGATGGTTCAC 

ArgAsnSerThrGlyLeuTyrHisValThrAsnAspCysProAsnSerSerIleValTyr
241 CGCAACTCCACGGGGCTTTACCACGTCACCAATGATTGCCCTAACTCGAGTATTGTGTAC
GCGTTGAGGTGCCCCGAAATGGTGCAGTGGTTACTAACGGGATTGAGCTCATAACACATG

GluAlaAlaAspAlaIleLeuHisThrProGlyCysValProCysValArgGluGlyAsn
301 GAGGCGGCCGATGCCATCCTGCACACTCCGGGGTGCGTCCCTTGCGTTCGTGAGGGCAAC 
CTCCGCCGGCTACGGTAGGACGTGTGAGGCCCCACGCAGGGAACGCAAGCACTCCCGTTG 

AlaSerArgCysTrpValAlaMetThrProThrValAlaThrArgAspGlyLysLeuPro
361 GCCTCGAGGTGTTGGGTGGCGATGACCCCTACGGTGGCCACCAGGGATGGCAAACTCCCC 
CGGAGCTCCACAACCCACCGCTACTGGGGATGCCACCGGTGGTCCCTACCGTTTGAGGGG 

AlaThrGlnLeuArgArgHisIleAspLeuLeuValGlySerAlaThrLeuCysSerAla 
421 GCGACGCAGCTTCGACGTCACATCGATCTGCTTGTCGGGAGCGCCACCCTCTGTTCGGCC
CGCTGCGTCGAAGCTGCAGTGTAGCTAGACGAACAGCCCTCGCGGTGGGAGACAAGCCGG 

LeuTyrValGlyAspLeuCysGlySerValPheLeuValGlyGlnLeuPheThrPheSer 
481 CTCTACGTGGGGGACCTATGCGGGTCTGTCTTTCTTGTCGGCCAACTGTTCACCTTCTCT
GAGATGCACCCCCTGGATACGCCCAGACAGAAAGAACAGCCGGTTGACAAGTGGAAGAGA 

ProArgArgHisTrpThrThrGlnGlyCysAsnCysSerIleTyrProGlyHisIleThr
541 CCCAGGCGCCACTGGACGACGCAAGGTTGCAATTGCTCTATCTATCCCGGCCATATAACG 
GGGTCCGCGGTGACCTGCTGCGTTCCAACGTTAACGAGATAGATAGGGCCGGTATATTGC 

GlyHisArgMetAlaTrpAspMetMetMetAsnTrpSerProThrThrAlaLeuValMet
601 GGTCACCGCATGGCATGGGATATGATGATGAACTGGTCCCCTACGACGGCGTTGGTAATG 
CCAGTGGCGTACCGTACCCTATACTACTACTTGACCAGGGGATGCTGCCGCAACCATTAC 

AlaGlnLeuLeuArgIleProGlnAlaIleLeuAspMetIleAlaGlyAlaHisTrpGly
661 GCTCAGCTGCTCCGGATCCCACAAGCCATCTTGGACATGATCGCTGGTGCTCACTGGGGA
CGAGTCGACGAGGCCTAGGGTGTTCGGTAGAACCTGTACTAGCGACCACGAGTGACCCCT

ValLeuAlaGlyIleAlaTyrPheSerMetValGlyAsnTrpAlaLysValLeuValVal
721 GTCCTGGCGGGCATAGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTG 
CAGGACCGCCCGTATCGCATAAAGAGGTACCACCCCTTGACCCGCTTCCAGGACCATCAC 

LeuLeuLeuPheAlaGlyValAspAlaGluThrHisValThrGlyGlySerAlaGlyHis 
781 CTGCTGCTATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCAC 
GACGACGATAAACGGCCGCAGCTGCGCCTTTGGGTGCAGTGGCCCCCTTCACGGCCGGTG 
FIG. 16-2

ThrValSerGlyPheValSerLeuLeuAlaProGlyAlaLysGlnAsnValGlnLeuIle
841 ACTGTGTCTGGATTTGTTAGCCTCCTCGCACCAGGCGCCAAGCAGAACGTCCAGCTGATC 
TGACACAGACCTAAACAATCGGAGGAGCGTGGTCCGCGGTTCGTCTTGCAGGTCGACTAG 

AsnThrAsnGlySerTrpHisLeuAsnSerThrAlaLeuAsnCysAsnAspSerLeuAsn 
901 AACACCAACGGCAGTTGGCACCTCAATAGCACGGCCCTGAACTGCAATGATAGCCTCAAC 
TTGTGGTTGCCGTCAACCGTGGAGTTATCGTGCCGGGACTTGACGTTACTATCGGAGTTG 

ThrGlyTrpLeuAlaGlyLeuPheTyrHisHisLysPheAsnSerSerGlyCysProGlu
961 ACCGGCTGGTTGGCAGGGCTTTTCTATCACCACAAGTTCAACTCTTCAGGCTGTCCTGAG
TGGCCGACCAACCGTCCCGAAAAGATAGTGGTGTTCAAGTTGAGAAGTCCGACAGGACTC 

ArgLeuAlaSerCysArgProLeuThrAspPheAspGlnGlyTrpGlyProIleSerTyr
1021 AGGCTAGCCAGCTGCCGACCCCTTACCGATTTTGACCAGGGCTGGGGCCCTATCAGTTAT 
TCCGATCGGTCGACGGCTGGGGAATGGCTAAAACTGGTCCCGACCCCGGGATAGTCAATA 

AlaAsnGlySerGlyProAspGlnArgProTyrCysTrpHisTyrProProLysProCys 
1081 GCCAACGGAAGCGGCCCCGACCAGCGCCCCTACTGCTGGCACTACCCCCCAAAACCTTGC 
CGGTTGCCTTCGCCGGGGCTGGTCGCGGGGATGACGACCGTGATGGGGGGTTTTGGAACG 

GlyIleValProAlaLysSerValCysGlyProValTyrCysPheThrProSerProVal
1141 GGTATTGTGCCCGCGAAGAGTGTGTGTGGTCCGGTATATTGCTTCACTCCCAGCCCCGTG 
CCATAACACGGGCGCTTCTCACACACACCAGGCCATATAACGAAGTGAGGGTCGGGGCAC 

ValValGlyThrThrAspArgSerGlyAlaProThrTyrSerTrpGlyGluAsnAspThr
1201 GTGGTGGGAACGACCGACAGGTCGGGCGCGCCCACCTACAGCTGGGGTGAAAATGATACG 
CACCACCCTTGCTGGCTGTCCAGCCCGCGCGGGTGGATGTCGACCCCACTTTTACTATGC 

AspValPheValLeuAsnAsnThrArgProProLeuGlyAsnTrpPheGlyCysThrTrp
1261 GACGTCTTCGTCCTTAACAATACCAGGCCACCGCTGGGCAATTGGTTCGGTTGTACCTGG 
CTGCAGAAGCAGGAATTGTTATGGTCCGGTGGCGACCCGTTAACCAAGCCAACATGGACC 

MetAsnSerThrGlyPheThrLysValCysGlyAlaProProCysValIleGlyGlyAla
1321 ATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCTCCTTGTGTCATCGGAGGGGCG
TACTTGAGTTGACCTAAGTGGTTTCACACGCCTCGCGGAGGAACACAGTAGCCTCCCCGC

GlyAsnAsnThrLeuHisCysProThrAspCysPheArgLysHisProAspAlaThrTyr 

1381 GGCAACAACACCCTGCACTGCCCCACTGATTGCTTCCGCAAGCATCCGGACGCCACATAC
CCGTTGTTGTGGGACGTGACGGGGTGACTAACGAAGGCGTTCGTAGGCCTGCGGTGTATG

SerArgCysGlySerGlyProTrpIleThrProArgCysLeuValAspTyrProTyrArg
1441 TCTCGGTGCGGCTCCGGTCCCTGGATCACACCCAGGTGCCTGGTCGACTACCCGTATAGG

AGAGCCACGCCGAGGCCAGGGACCTAGTGTGGGTCCACGGACCAGCTGATGGGCATATCC 

LeuTrpHisTyrProCysThrIleAsnTyrThrIlePheLysIleArgMetTyrValGly
1501 CTTTGGCATTATCCTTGTACCATCAACTACACCATATTTAAAATCAGGATGTACGTGGGA
GAAACCGTAATAGGAACATGGTAGTTGATGTGGTATAAATTTTAGTCCTACATGCACCCT

GlyValGluHisArgLeuGluAlaAlaCysAsnTrpThrArgGlyGluArgCysAspLeu
1561 GGGGTCGAACACAGGCTGGAAGCTGCCTGCAACTGGACGCGGGGCGAACGTTGCGATCTG 
CCCCAGCTTGTGTCCGACCTTCGACGGACGTTGACCTGCGCCCCGCTTGCAACGCTAGAC 

GluAspArgAspArgSerGluLeuSerProLeuLeuLeuThrThrThrGlnTrpGlnVal
1621 GAAGACAGGGACAGGTCCGAGCTCAGCCCGTTACTGCTGACCACTACACAGTGGCAGGTC
CTTCTGTCCCTGTCCAGGCTCGAGTCGGGCAATGACGACTGGTGATGTGTCACCGTCCAG 

LeuProCysSerPheThrThrLeuProAlaLeuSerThrGlyLeuIleHisLeuHisGln
1681 CTCCCGTGTTCCTTCACAACCCTACCAGCCTTGTCCACCGGCCTCATCCACCTCCACCAG 
GAGGGCACAAGGAAGTGTTGGGATGGTCGGAACAGGTGGCCGGAGTAGGTGGAGGTGGTC 

AsnIleValAspValGlnTyrLeuTyrGlyValGlySerSerIleAlaSerTrpAlaIle
1741 AACATTGTGGACGTGCAGTACTTGTACGGGGTGGGGTCAAGCATCGCGTCCTGGGCCATT 
TTGTAACACCTGCACGTCATGAACATGCCCCACCCCAGTTCGTAGCGCAGGACCCGGTAA 

LysTrpGluTyrValValLeuLeuPheLeuLeuLeuAlaAspAlaArgValCysSerCys
1801 AAGTGGGAGTACGTCGTTCTCCTGTTCCTTCTGCTTGCAGACGCGCGCGTCTGCTCCTGC
TTCACCCTCATGCAGCAAGAGGACAAGGAAGACGAACGTCTGCGCGCGCAGACGAGGACG

FIG. 16-3

LeuTrpMetMetLeuLeuIleSerGlnAlaGluAlaAlaLeuGluAsnLeuValIleLeu
1861 TTGTGGATGATGCTACTCATATCCCAAGCGGAGGCGGCTTTGGAGAACCTCGTAATACTT 
AACACCTACTACGATGAGTATAGGGTTCGCCTCCGCCGAAACCTCTTGGAGCATTATGAA 

AsnAlaAlaSerLeuAlaGlyThrHisGlyLeuValSerPheLeuValPhePheCysPhe
1921 AATGCAGCATCCCTGGCCGGGACGCACGGTCTTGTATCCTTCCTCGTGTTCTTCTGCTTT
TTACGTCGTAGGGACCGGCCCTGCGTGCCAGAACATAGGAAGGAGCACAAGAAGACGAAA 

AlaTrpTyrLeuLysGlyLysTrpValProGlyAlaValTyrThrPheTyrGlyMetTrp

1981 GCATGGTATTTGAAGGGTAAGTGGGTGCCCGGAGCGGTCTACACCTTCTACGGGATGTGG 
CGTACCATAAACTTCCCATTCACCCACGGGCCTCGCCAGATGTGGAAGATGCCCTACACC 

ProLeuLeuLeuLeuLeuLeuAlaLeuProGlnArgAlaTyrAlaLeuAspThrGluVal
2041 CCTCTCCTCCTGCTCCTGTTGGCGTTGCCCCAGCGGGCGTACGCGCTGGACACGGAGGTG
GGAGAGGAGGACGAGGACAACCGCAACGGGGTCGCCCGCATGCGCGACCTGTGCCTCCAC

AlaAlaSerCysGlyGlyValValLeuValGlyLeuMetAlaLeuThrLeuSerProTyr 

2101 GCCGCGTCGTGTGGCGGTGTTGTTCTCGTCGGGTTGATGGCGCTGACTCTGTCACCATAT
CGGCGCAGCACACCGCCACAACAAGAGCAGCCCAACTACCGCGACTGAGACAGTGGTATA 

TyrLysArgTyrIleSerTrpCysLeuTrpTrpLeuGlnTyrPheLeuThrArgValGlu
2161 TACAAGCGCTATATCAGCTGGTGCTTGTGGTGGCTTCAGTATTTTCTGACCAGAGTGGAA 
ATGTTCGCGATATAGTCGACCACGAACACCACCGAAGTCATAAAAGACTGGTCTCACCTT 

AlaGlnLeuHisValTrpIleProProLeuAsnValArgGlyGlyArgAspAlaValIle
2221 GCGCAACTGCACGTGTGGATTCCCCCCCTCAACGTCCGAGGGGGGCGCGACGCCGTCATC 
CGCGTTGACGTGCACACCTAAGGGGGGGAGTTGCAGGCTCCCCCCGCGCTGCGGCAGTAG 

LeuLeuMetCysAlaValHisProThrLeuValPheAspIleThrLysLeuLeuLeuAla
2281 TTACTCATGTGTGCTGTACACCCGACTCTGGTATTTGACATCACCAAATTGCTGCTGGCC
AATGAGTACACACGACATGTGGGCTGAGACCATAAACTGTAGTGGTTTAACGACGACCGG 

ValPheGlyProLeuTrpIleLeuGlnAlaSerLeuLeuLysValProTyrPheValArg 

2341 GTCTTCGGACCCCTTTGGATTCTTCAAGCCAGTTTGCTTAAAGTACCCTACTTTGTGCGC 
CAGAAGCCTGGGGAAACCTAAGAAGTTCGGTCAAACGAATTTCATGGGATGAAACACGCG 

ValGlnGlyLeuLeuArgPheCysAlaLeuAlaArgLysMetIleGlyGlyHisTyrVal
2401 GTCCAAGGCCTTCTCCGGTTCTGCGCGTTAGCGCGGAAGATGATCGGAGGCCATTACGTG
CAGGTTCCGGAAGAGGCCAAGACGCGCAATCGCGCCTTCTACTAGCCTCCGGTAATGCAC 

GlnMetValIleIleLysLeuGlyAlaLeuThrGlyThrTyrValTyrAsnHisLeuThr
2461 CAAATGGTCATCATTAAGTTAGGGGCGCTTACTGGCACCTATGTTTATAACCATCTCACT 
GTTTACCAGTAGTAATTCAATCCCCGCGAATGACCGTGGATACAAATATTGGTAGAGTGA 

ProLeuArgAspTrpAlaHisAsnGlyLeuArgAspLeuAlaValAlaValGluProVal
2521 CCTCTTCGGGACTGGGCGCACAACGGCTTGCGAGATCTGGCCGTGGCTGTAGAGCCAGTC 
GGAGAAGCCCTGACCCGCGTGTTGCCGAACGCTCTAGACCGGCACCGACATCTCGGTCAG 

ValPheSerGlnMetGluThrLysLeuIleThrTrpGlyAlaAspThrAlaAlaCysGly
2581 GTCTTCTCCCAAATGGAGACCAAGCTCATCACGTGGGGGGCAGATACCGCCGCGTGCGGT 
CAGAAGAGGGTTTACCTCTGGTTCGAGTAGTGCACCCCCCGTCTATGGCGGCGCACGCCA 

AspIleIleAsnGlyLeuProValSerAlaArgArgGlyArgGluIleLeuLeuGlyPro
2641 GACATCATCAACGGCTTGCCTGTTTCCGCCCGCAGGGGCCGGGAGATACTGCTCGGGCCA
CTGTAGTAGTTGCCGAACGGACAAAGGCGGGCGTCCCCGGCCCTCTATGACGAGCCCGGT 

AlaAspGlyMetValSerLysGlyTrpArgLeuLeuAlaProIleThrAlaTyrAlaGln 
2701 GCCGATGGAATGGTCTCCAAGGGGTGGAGGTTGCTGGCGCCCATCACGGCGTACGCCCAG 
CGGCTACCTTACCAGAGGTTCCCCACCTCCAACGACCGCGGGTAGTGCCGCATGCGGGTC 

GlnThrArgGlyLeuLeuGlyCysIleIleThrSerLeuThrGlyArgAspLysAsnGln
2761 CAGACAAGGGGCCTCCTAGGGTGCATAATCACCAGCCTAACTGGCCGGGACAAAAACCAA 
GTCTGTTCCCCGGAGGATCCCACGTATTAGTGGTCGGATTGACCGGCCCTGTTTTTGGTT 
FIG. 16-4

ValGluGlyGluValGlnIleValSerThrAlaAlaGlnThrPheLeuAlaThrCysIle
2821 GTGGAGGGTGAGGTCCAGATTGTGTCAACTGCTGCCCAAACCTTCCTGGCAACGTGCATC
CACCTCCCACTCCAGGTCTAACACAGTTGACGACGGGTTTGGAAGGACCGTTGCACGTAG 

AsnGlyValCysTrpThrValTyrHisGlyAlaGlyThrArgThrIleAlaSerProLys
2881 AATGGGGTGTGCTGGACTGTCTACCACGGGGCCGGAACGAGGACCATCGCGTCACCCAAG 
TTACCCCACACGACCTGACAGATGGTGCCCCGGCCTTGCTCCTGGTAGCGCAGTGGGTTC 

GlyProValIleGlnMetTyrThrAsnValAspGlnAspLeuValGlyTrpProAlaPro
2941 GGTCCTGTCATCCAGATGTATACCAATGTAGACCAAGACCTTGTGGGCTGGCCCGCTCCG 
CCAGGACAGTAGGTCTACATATGGTTACATCTGGTTCTGGAACACCCGACCGGGCGAGGC 

GlnGlySerArgSerLeuThrProCysThrCysGlySerSerAspLeuTyrLeuValThr 
3001 CAAGGTAGCCGCTCATTGACACCCTGCACTTGCGGCTCCTCGGACCTTTACCTGGTCACG 
GTTCCATCGGCGAGTAACTGTGGGACGTGAACGCCGAGGAGCCTGGAAATGGACCAGTGC 

ArgHisAlaAspValIleProValArgArgArgGlyAspSerArgGlySerLeuLeuSer
3061 AGGCACGCCGATGTCATTCCCGTGCGCCGGCGGGGTGATAGCAGGGGCAGCCTGCTGTCG 
TCCGTGCGGCTACAGTAAGGGCACGCGGCCGCCCCACTATCGTCCCCGTCGGACGACAGC 

ProArgProIleSerTyrLeuLysGlySerSerGlyGlyProLeuLeuCysProAlaGly
3121 CCCCGGCCCATTTCCTACTTGAAAGGCTCCTCGGGGGGTCCGCTGTTGTGCCCCGCGGGG 
GGGGCCGGGTAAAGGATGAACTTTCCGAGGAGCCCCCCAGGCGACAACACGGGGCGCCCC 

HisAlaValGlyIlePheArgAlaAlaValCysThrArgGlyValAlaLysAlaValAsp
3181 CACGCCGTGGGCATATTTAGGGCCGCGGTGTGCACCCGTGGAGTGGCTAAGGCGGTGGAC 
GTGCGGCACCCGTATAAATCCCGGCGCCACACGTGGGCACCTCACCGATTCCGCCACCTG 

PheIleProValGluAsnLeuGluThrThrMetArgSerProValPheThrAspAsnSer
3241 TTTATCCCTGTGGAGAACCTAGAGACAACCATGAGGTCCCCGGTGTTCACGGATAACTCC
AAATAGGGACACCTCTTGGATCTCTGTTGGTACTCCAGGGGCCACAAGTGCCTATTGAGG 

SerProProValValProGlnSerPheGlnValAlaHisLeuHisAlaProThrGlySer
3301 TCTCCACCAGTAGTGCCCCAGAGCTTCCAGGTGGCTCACCTCCATGCTCCCACAGGCAGC
AGAGGTGGTCATCACGGGGTCTCGAAGGTCCACCGAGTGGAGGTACGAGGGTGTCCGTCG 

GlyLysSerThrLysValProAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeu
3361 GGCAAAAGCACCAAGGTCCCGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTC 
CCGTTTTCGTGGTTCCAGGGCCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAG 

AsnProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIle
3421 AACCCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATC
TTGGGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAG 

AspProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSer
3481 GATCCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCC
CTAGGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGG 

ThrTyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIle
3541 ACCTACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATT
TGGATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAA 

CysAspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAsp
3601 TGTGACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATCGGCACTGTCCTTGAC 
ACACTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAGCCGTGACAGGAACTG 

GlnAlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySer
3661 CAAGCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCC
GTTCGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGG 

ValThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIlePro
3721 GTCACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCT
CAGTGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGA 

PheTyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCys
3781 TTTTACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGT 
FIG. 16-5

AAAATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACA 

HisSerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAla
3841 CATTCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCC
GTAAGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGG 

ValAlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValVal
3901 GTGGCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTC
CACCGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAG

ValAlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCys
3961 GTGGCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGC 
CACCGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACG 

AsnThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThr
4021 AATACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACA
TTATGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGT 

IleThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGly
4081 ATCACGCTCCCCCAGGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGG
TAGTGCGAGGGGGTCCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCGCCC 

LysProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSer
4141 AAGCCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCG
TTCGGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGC 

SerValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGlu
4201 TCCGTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAG
AGGCAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTC 

ThrThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHis 
4261 ACTACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCAT
TGATGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTA 

LeuGluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSer
4321 CTTGAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCC
GAACTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGG

GlnThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCys 
4381 CAGACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGC 
GTCTGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACG 

AlaArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeu
4441 GCTAGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTC 
CGATCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAG 

LysProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGlu
4501 AAGCCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAA 
TTCGCGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTT 

IleThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGlu
4561 ATCACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAG
TAGTGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTC 

ValValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCys 
4621 GTCGTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGC
CAGCAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACG 

LeuSerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIle
4681 CTGTCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATC
GACAGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAG 

IleProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHis 
4741 ATACCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCAC
TATGGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTG 
FIG. 16-6

LeuProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGly
4801 TTACCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGC
AATGGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCG 

LeuLeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrp
4861 CTCCTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGG
GAGGACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACC 

GlnLysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyr
4921 CAAAAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATAC
GTTTTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATG 

LeuAlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThr
4981 TTGGCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACA
AACCGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGT 

AlaAlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGly
5041 GCTGCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGG 
CGACGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCC 

TrpValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeu
5101 TGGGTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTA 
ACCCACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAAT 

AlaGlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGly
5161 GCTGGCGCCCCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGG 
CGACCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCC 

TyrGlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValPro 
5221 TATGGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCC 
ATACCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGG 

SerThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValVal
5281 TCCACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTC
AGGTGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAG

GlyValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGln
5341 GGCGTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAG 
CCGCACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTC 

TrpMetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyr 
5401 TGGATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTAC 
ACCTACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATG 

ValProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThr
5461 GTGCCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACC 
CACGGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGG 

GlnLeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGly
5521 CAGCTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGT
GTCGAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCA 

SerTrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrp 
5581 TCCTGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGG
AGGACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTCAAATTCTGGACC 

LeuLysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGly
5641 CTAAAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGG 
GATTTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCC 

TyrLysGlyValTrpArgValAspGlyIleMetHisThrArgCysHisCysGlyAlaGlu
5701 TATAAGGGGGTCTGGCGAGTGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAG 
ATATTCCCCCAGACCGCTCACCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTC 

IleThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsn
5761 ATCACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAAC 






FIG. 16-7

TAGTGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTG
MetTrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuPro

5821 ATGTGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCT
TACACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGA

AlaProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArg
5881 GCGCCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATATGTGGAGATAAGG
CGCGGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATACACCTCTATTCC 

GlnValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCys
5941 CAGGTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTCAAATGCCCGTGC
GTCCACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAGTTTACGGGCACG

GlnValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAla
6001 CAGGTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCG
GTCCAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGC 

ProProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyr
6061 CCCCCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATAC 
GGGGGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATG 

ProValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMet 

6121 CCGGTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATG 
GGCCATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTAC 

LeuThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySer 
6181 CTCACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCA 
GAGTGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCGCTAGT 

ProProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThr
6241 CCCCCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACT 
GGGGGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGA 

CysThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArg

6301 TGCACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGG 
ACGTGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCC 

GlnGluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAsp
6361 CAGGAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGAC 
GTCCTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTG 

SerPheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIle
6421 TCCTTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATC
AGGAAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAG 

LeuArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsn
6481 CTGCGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAAC
GACGCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTG 

ProProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCys 

6541 CCCCCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGT
GGGGGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACA 

ProLeuProProProLysSerProProValProProProArgLysLysArgThrValVal 
6601 CCGCTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTC 
GGCGAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAG 

LeuThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySer
6661 CTCACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGC
GAGTGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCG 

SerSerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSer
6721 TCCTCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCT
AGGAGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCCGGGAAGA
FIG. 16-8

GlyCysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGlu
6781 GGCTGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAG 
CCGACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTC 

ProGlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAla

6841 CCTGGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCG

GGACCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGC 

GluAspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCys 
6901 GAGGATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGC 
CTCCTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACG 

AlaAlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHis
6961 GCCGCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCAC 
CGGCGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTG 

AsnLeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPhe
7021 AATTTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTT 
TTAAACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAA 

AspArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAla 
7081 GACAGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCG
CTGTCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGC 

AlaSerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProPro 
7141 GCGTCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCA 
CGCAGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGT

HisSerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLys

7201 CACTCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAG
GTGAGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTC 

AlaValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIle

7261 GCCGTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATA

CGGCATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTAT 


AspThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArg
7321 GACACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGT
CTGTGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCA 

LysProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAla
7381 AAGCCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCT
TTCGGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGA 

LeuTyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGln 
7441 TTGTACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAA
AACATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTT 

TyrSerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrPro

7501 TACTCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCA 
ATGAGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGT 

MetGlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArg 
7561 ATGGGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGT
TACCCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCA

ThrGluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLys
7621 ACGGAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAG 
TGCCTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTC 

SerLeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCys 
7681 TCCCTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGC
AGGGAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACG 

GlyTyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThr 
7741 GGCTATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACT
FIG. 16-9

CCGATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGA 

CysTyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeu
7801 TGCTACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTC 
ACGATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAG 

ValCysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAla
7861 GTGTGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCG
CACACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGC 

SerLeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProPro 
7921 AGCCTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCA 
TCGGACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGT 

GlnProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHis 
7981 CAACCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCAC
GTTGGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTG

AspGlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArg 
8041 GACGGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGA 
CTGCCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCT 

AlaAlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMet
8101 GCTGCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATG
CGACGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTAC 

PheAlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIle
8161 TTTGCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATA
AAACGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATAT 

AlaArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIle
8221 GCCAGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATA
CGGTCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTAT 

GluProLeuAspLeuProProIleIleGlnArgLeu
8281 GAACCACTTGATCTACCTCCAATCATTCAAAGACTC
CTTGGTGAACTAGATGGAGGTTAGTAAGTTTCTGAG 
-319 CACTCCACCATGAATCACTCCCCTGTGAGGAACTACTGTCTTCACGCAGAAAGCGTCTAG
GTGAGGTGGTACTTAGTGAGGGGACACTCCTTGATGACAGAAGTGCGTCTTTCGCAGATC 

-259 CCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCATA 
GGTACCGCAATCATACTCACAGCACGTCGGAGGTCCTGGGGGGGAGGGCCCTCTCGGTAT 

-199 GTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGA 
CACCAGACGCCTTGGCCACTCATGTGGCCTTAACGGTCCTGCTGGCCCAGGAAAGAACCT 

-139 TCAACCCGCTCAATGCCTGGAGATTTGGGCGTGCCCCCGCAAGACTGCTAGCCGAGTAGT
AGTTGGGCGAGTTACGGACCTCTAAACCCGCACGGGGGCGTTCTGACGATCGGCTCATCA 

-79 GTTGGGTCGCGAAAGGCCTTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAG 
CAACCCAGCGCTTTCCGGAACACCATGACGGACTATCCCACGAACGCTCACGGGGCCCTC 

-19 GTCTCGTAGACCGTGCACC
CAGAGCATCTGGCACGTGG

Arg Thr 

MetSerThrAsnProLysProGlnLysLysAsnLysArgAsnThrAsnArgArgProGln
1 ATGAGCACGAATCCTAAACCTCAAAAAAAAAACAAACGTAACACCAACCGTCGCCCACAG
TACTCGTGCTTAGGATTTGGAGTTTTTTTTTTGTTTGCATTGTGGTTGGCAGCGGGTGTC

AspValLysPheProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArg
61 GACGTCAAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGG
CTGCAGTTCAAGGGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCC 

GlyProArgLeuGlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGly
121 GGCCCTAGATTGGGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGT
CCGGGATCTAACCCACACGCGCGCTCCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCA 

ArgArgGlnProIleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGly 
181 AGACGTCAGCCTATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGG
TCTGCAGTCGGATAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCC 

TyrProTrpProLeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerPro
241 TACCCTTGGCCCCTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCC
ATGGGAACCGGGGAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGG 

ArgGlySerArgProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGly
301 CGTGGCTCTCGGCCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGT 
GCACCGAGAGCCGGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCA 

LysValIleAspThrLeuThrCysGlyPheAlaAspLeuMetGlyTyrIleProLeuVal
361 AAGGTCATCGATACCCTTACGTGCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTC
TTCCAGTAGCTATGGGAATGCACGCCGAAGCGGCTGGAGTACCCCATGTATGGCGAGCAG 

GlyAlaProLeuGlyGlyAlaAlaArgAlaLeuAlaHisGlyValArgValLeuGluAsp
421 GGCGCCCCTCTTGGAGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAAGAC
CCGCGGGGAGAACCTCCGCGACGGTCCCGGGACCGCGTACCGCAGGCCCAAGACCTTCTG

Thr 

GlyValAsnTyrAlaThrGlyAsnLeuProGlyCysSerPheSerIlePheLeuLeuAla
481 GGCGTGAACTATGCAACAGGGAACCTTCCTGGTTGCTCTTTCTCTATCTTCCTTCTGGCC 

CCGCACTTGATACGTTGTCCCTTGGAAGGACCAACGAGAAAGAGATAGAAGGAAGACCGG 

LeuLeuSerCysLeuThrValProAlaSerAlaTyrGlnValArgAsnSerThrGlyLeu
541 CTGCTCTCTTGCTTGACTGTGCCCGCTTCGGCCTACCAAGTGCGCAACTCCACGGGGCTT
GACGAGAGAACGAACTGACACGGGCGAAGCCGGATGGTTCACGCGTTGAGGTGCCCCGAA

TyrHisValThrAsnAspCysProAsnSerSerIleValTyrGluAlaAlaAspAlaIle
601 TACCACGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGGCGGCCGATGCCATC 
ATGGTGCAGTGGTTACTAACGGGATTGAGCTCATAACACATGCTCCGCCGGCTACGGTAG 

LeuHisThrProGlyCysValProCysValArgGluGlyAsnAlaSerArgCysTrpVal 
661 CTGCACACTCCGGGGTGCGTCCCTTGCGTTCGTGAGGGCAACGCCTCGAGGTGTTGGGTG 
GACGTGTGAGGCCCCACGCAGGGAACGCAAGCACTCCCGTTGCGGAGCTCCACAACCCAC 
FIG. 17-2

AlaMetThrProThrValAlaThrArgAspGlyLysLeuProAlaThrGlnLeuArgArg 
721 GCGATGACCCCTACGGTGGCCACCAGGGATGGCAAACTCCCCGCGACGCAGCTTCGACGT 
CGCTACTGGGGATGCCACCGGTGGTCCCTACCGTTTGAGGGGCGCTGCGTCGAAGCTGCA 

HisIleAspLeuLeuValGlySerAlaThrLeuCysSerAlaLeuTyrValGlyAspLeu 
781 CACATCGATCTGCTTGTCGGGAGCGCCACCCTCTGTTCGGCCCTCTACGTGGGGGACCTA
GTGTAGCTAGACGAACAGCCCTCGCGGTGGGAGACAAGCCGGGAGATGCACCCCCTGGAT 

CysGlySerValPheLeuValGlyGlnLeuPheThrPheSerProArgArgHisTrpThr 
841 TGCGGGTCTGTCTTTCTTGTCGGCCAACTGTTCACCTTCTCTCCCAGGCGCCACTGGACG 
ACGCCCAGACAGAAAGAACAGCCGGTTGACAAGTGGAAGAGAGGGTCCGCGGTGACCTGC 

ThrGlnGlyCysAsnCysSerIleTyrProGlyHisIleThrGlyHisArgMetAlaTrp
901 ACGCAAGGTTGCAATTGCTCTATCTATCCCGGCCATATAACGGGTCACCGCATGGCATGG 
TGCGTTCCAACGTTAACGAGATAGATAGGGCCGGTATATTGCCCAGTGGCGTACCGTACC 

Val 

AspMetMetMetAsnTrpSerProThrThrAlaLeuValMetAlaGlnLeuLeuArgIle
961 GATATGATGATGAACTGGTCCCCTACGACGGCGTTGGTAATGGCTCAGCTGCTCCGGATC
CTATACTACTACTTGACCAGGGGATGCTGCCGCAACCATTACCGAGTCGACGAGGCCTAG

ProGlnAlaIleLeuAspMetIleAlaGlyAlaHisTrpGlyValLeuAlaGlyIleAla
1021 CCACAAGCCATCTTGGACATGATCGCTGGTGCTCACTGGGGAGTCCTGGCGGGCATAGCG
GGTGTTCGGTAGAACCTGTACTAGCGACCACGAGTGACCCCTCAGGACCGCCCGTATCGC 

TyrPheSerMetValGlyAsnTrpAlaLysValLeuValValLeuLeuLeuPheAlaGly
1081 TATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTGCTGCTGCTATTTGCCGGC
ATAAAGAGGTACCACCCCTTGACCCGCTTCCAGGACCATCACGACGACGATAAACGGCCG 

ValAspAlaGluThrHisValThrGlyGlySerAlaGlyHisThrValSerGlyPheVal 
1141 GTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCACACTGTGTCTGGATTTGTT 
CAGCTGCGCCTTTGGGTGCAGTGGCCCCCTTCACGGCCGGTGTGACACAGACCTAAACAA 

SerLeuLeuAlaProGlyAlaLysGlnAsnValGlnLeuIleAsnThrAsnGlySerTrp
1201 AGCCTCCTCGCACCAGGCGCCAAGCAGAACGTCCAGCTGATCAACACCAACGGCAGTTGG 
TCGGAGGAGCGTGGTCCGCGGTTCGTCTTGCAGGTCGACTAGTTGTGGTTGCCGTCAACC 

HisLeuAsnSerThrAlaLeuAsnCysAsnAspSerLeuAsnThrGlyTrpLeuAlaGly 
1261 CACCTCAATAGCACGGCCCTGAACTGCAATGATAGCCTCAACACCGGCTGGTTGGCAGGG 
GTGGAGTTATCGTGCCGGGACTTGACGTTACTATCGGAGTTGTGGCCGACCAACCGTCCC 


LeuPheTyrHisHisLysPheAsnSerSerGlyCysProGluArgLeuAlaSerCysArg
1321 CTTTTCTATCACCACAAGTTCAACTCTTCAGGCTGTCCTGAGAGGCTAGCCAGCTGCCGA
GAAAAGATAGTGGTGTTCAAGTTGAGAAGTCCGACAGGACTCTCCGATCGGTCGACGGCT 

ProLeuThrAspPheAspGlnGlyTrpGlyProIleSerTyrAlaAsnGlySerGlyPro 
1381 CCCCTTACCGATTTTGACCAGGGCTGGGGCCCTATCAGTTATGCCAACGGAAGCGGCCCC 
GGGGAATGGCTAAAACTGGTCCCGACCCCGGGATAGTCAATACGGTTGCCTTCGCCGGGG 

AspGlnArgProTyrCysTrpHisTyrProProLysProCysGlyIleValProAlaLys
1441 GACCAGCGCCCCTACTGCTGGCACTACCCCCCAAAACCTTGCGGTATTGTGCCCGCGAAG
CTGGTCGCGGGGATGACGACCGTGATGGGGGGTTTTGGAACGCCATAACACGGGCGCTTC 

SerValCysGlyProValTyrCysPheThrProSerProValValValGlyThrThrAsp 
1501 AGTGTGTGTGGTCCGGTATATTGCTTCACTCCCAGCCCCGTGGTCGTGGGAACGACCGAC
TCACACACACCAGGCCATATAACGAAGTGAGGGTCGGGGCACCACCACCCTTGCTGGCTG

ArgSerGlyAlaProThrTyrSerTrpGlyGluAsnAspThrAspValPheValLeuAsn
1561 AGGTCGGGCGCGCCCACCTACAGCTGGGGTGAAAATGATACGGACGTCTTCGTCCTTAAC 
TCCAGCCCGCGCGGGTGGATGTCGACCCCACTTTTACTATGCCTGCAGAAGCAGGAATTG 

AsnThrArgProProLeuGlyAsnTrpPheGlyCysThrTrpMetAsnSerThrGlyPhe
1621 AATACCAGGCCACCGCTGGGCAATTGGTTCGGTTGTACCTGGATGAACTCAACTGGATTC 
TTATGGTCCGGTGGCGACCCGTTAACCAAGCCAACATGGACCTACTTGAGTTGACCTAAG 
FIG. 17-3

ThrLysValCysGlyAlaProProCysValIleGlyGlyAlaGlyAsnAsnThrLeuHis
1681 ACCAAAGTGTGCGGAGCGCCTCCTTGTGTCATCGGAGGGGCGGGCAACAACACCCTGCAC 
TGGTTTCACACGCCTCGCGGAGGAACACAGTAGCCTCCCCGCCCGTTGTTGTGGGACGTG 

CysProThrAspCysPheArgLysHisProAspAlaThrTyrSerArgCysGlySerGly 
1741 TGCCCCACTGATTGCTTCCGCAAGCATCCGGACGCCACATACTCTCGGTGCGGCTCCGGT 
ACGGGGTGACTAACGAAGGCGTTCGTAGGCCTGCGGTGTATGAGAGCCACGCCGAGGCCA 

Leu

ProTrpIleThrProArgCysLeuValAspTyrProTyrArgLeuTrpHisTyrProCys
1801 CCCTGGATCACACCCAGGTGCCTGGTCGACTACCCGTATAGGCTTTGGCATTATCCTTGT
GGGACCTAGTGTCGGTCCACGGACCAGCTGATGGGCATATCCGAAACCGTAATAGGAACA 

ThrIleAsnTyrThrIlePheLysIleArgMetTyrValGlyGlyValGluHisArgLeu
1861 ACCATCAACTACACCATATTTAAAATCAGGATGTACGTGGGAGGGGTCGAACACAGGCTG
TGGTAGTTGATGTGGTATAAATTTTAGTCCTACATGCACCCTCCCCAGCTTGTGTCCGAC

GluAlaAlaCysAsnTrpThrArgGlyGluArgCysAspLeuGluAspArgAspArgSer
1921 GAAGCTGCCTGCAACTGGACGCGGGGCGAACGTTGCGATCTGGAAGACAGGGACAGGTCC 
CTTCGACGGACGTTGACCTGCGCCCCGCTTGCAACGCTAGACCTTCTGTCCCTGTCCAGG 

GluLeuSerProLeuLeuLeuThrThrThrGlnTrpGlnValLeuProCysSerPheThr
1981 GAGCTCAGCCCGTTACTGCTGACCACTACACAGTGGCAGGTCCTCCCGTGTTCCTTCACA 
CTCGAGTCGGGCAATGACGACTGGTGATGTGTCACCGTCCAGGAGGGCACAAGGAAGTGT 

ThrLeuProAlaLeuSerThrGlyLeuIleHisLeuHisGlnAsnIleValAspValGln
2041 ACCCTACCAGCCTTGTCCACCGGCCTCATCCACCTCCACCAGAACATTGTGGACGTGCAG 
TGGGATGGTCGGAACAGGTGGCCGGAGTAGGTGGAGGTGGTCTTGTAACACCTGCACGTC 

TyrLeuTyrGlyValGlySerSerIleAlaSerTrpAlaIleLysTrpGluTyrValVal
2101 TACTTGTACGGGGTGGGGTCAAGCATCGCGTCCTGGGCCATTAAGTGGGAGTACGTCGTT 
ATGAACATGCCCCACCCCAGTTCGTAGCGCAGGACCCGGTAATTCACCCTCATGCAGCAA 

LeuLeuPheLeuLeuLeuAlaAspAlaArgValCysSerCysLeuTrpMetMetLeuLeu
2161 CTCCTGTTCCTTCTGCTTGCAGACGCGCGCGTCTGCTCCTGCTTGTGGATGATGCTACTC
GAGGACAAGGAAGACGAACGTCTGCGCGCGCAGACGAGGACGAACACCTACTACGATGAG 

IleSerGlnAlaGluAlaAlaLeuGluAsnLeuValIleLeuAsnAlaAlaSerLeuAla
2221 ATATCCCAAGCGGAGGCGGCTTTGGAGAACCTCGTAATACTTAATGCAGCATCCCTGGCC 
TATAGGGTTCGCCTCCGCCGAAACCTCTTGGAGCATTATGAATTACGTCGTAGGGACCGG 

GlyThrHisGlyLeuValSerPheLeuValPhePheCysPheAlaTrpTyrLeuLysGly
2281 GGGACGCACGGTCTTGTATCCTTCCTCGTGTTCTTCTGCTTTGCATGGTATTTGAAGGGT
CCCTGCGTGCCAGAACATAGGAAGGAGCACAAGAAGACGAAACGTACCATAAACTTCCCA

LysTrpValProGlyAlaValTyrThrPheTyrGlyMetTrpProLeuLeuLeuLeuLeu
2341 AAGTGGGTGCCCGGAGCGGTCTACACCTTCTACGGGATGTGGCCTCTCCTCCTGCTCCTG
TTCACCCACGGGCCTCGCCAGATGTGGAAGATGCCCTACACCGGAGAGGAGGACGAGGAC 

LeuAlaLeuProGlnArgAlaTyrAlaLeuAspThrGluValAlaAlaSerCysGlyGly
2401 TTGGCGTTGCCCCAGCGGGCGTACGCGCTGGACACGGAGGTGGCCGCGTCGTGTGGCGGT 
AACCGCAACGGGGTCGCCCGCATGCGCGACCTGTGCCTCCACCGGCGCAGCACACCGCCA 

ValValLeuValGlyLeuMetAlaLeuThrLeuSerProTyrTyrLysArgTyrIleSer
2461 GTTGTTCTCGTCGGGTTGATGGCGCTGACTCTGTCACCATATTACAAGCGCTATATCAGC 
CAACAAGAGCAGCCCAACTACCGCGACTGAGACAGTGGTATAATGTTCGCGATATAGTCG 

Asn 

TrpCysLeuTrpTrpLeuGlnTyrPheLeuThrArgValGluAlaGlnLeuHisValTrp

2521 TGGTGCTTGTGGTGGCTTCAGTATTTTCTGACCAGAGTGGAAGCGCAACTGCACGTGTGG

ACCACGAACACCACCGAAGTCATAAAAGACTGGTCTCACCTTCGCGTTGACGTGCACACC 

IleProProLeuAsnValArgGlyGlyArgAspAlaValIleLeuLeuMetCysAlaVal
2581 ATTCCCCCCCTCAACGTCCGAGGGGGGCGCGACGCCGTCATCTTACTCATGTGTGCTGTA 
TAAGGGGGGGAGTTGCAGGCTCCCCCCGCGCTGCGGCAGTAGAATGAGTACACACGACAT 
FIG. 17-4

HisProThrLeuValPheAspIleThrLysLeuLeuLeuAlaValPheGlyProLeuTrp
2641 CACCCGACTCTGGTATTTGACATCACCAAATTGCTGCTGGCCGTCTTCGGACCCCTTTGG 
GTGGGCTGAGACCATAAACTGTAGTGGTTTAACGACGACCGGCAGAAGCCTGGGGAAACC 

IleLeuGlnAlaSerLeuLeuLysValProTyrPheValArgValGlnGlyLeuLeuArg 
2701 ATTCTTCAAGCCAGTTTGCTTAAAGTACCCTACTTTGTGCGCGTCCAAGGCCTTCTCCGG 
TAAGAAGTTCGGTCAAACGAATTTCATGGGATGAAACACGCGCAGGTTCCGGAAGAGGCC 

PheCysAlaLeuAlaArgLysMetIleGlyGlyHisTyrValGlnMetValIleIleLys
2761 TTCTGCGCGTTAGCGCGGAAGATGATCGGAGGCCATTACGTGCAAATGGTCATCATTAAG
AAGACGCGCAATCGCGCCTTCTACTAGCCTCCGGTAATGCACGTTTACCAGTAGTAATTC 

LeuGlyAlaLeuThrGlyThrTyrValTyrAsnHisLeuThrProLeuArgAspTrpAla 
2821 TTAGGGGCGCTTACTGGCACCTATGTTTATAACCATCTCACTCCTCTTCGGGACTGGGCG 
AATCCCCGCGAATGACCGTGGATACAAATATTGGTAGAGTGAGGAGAAGCCCTGACCCGC 

HisAsnGlyLeuArgAspLeuAlaValAlaValGluProValValPheSerGlnMetGlu 
2881 CACAACGGCTTGCGAGATCTGGCCGTGGCTGTAGAGCCAGTCGTCTTCTCCCAAATGGAG
GTGTTGCCGAACGCTCTAGACCGGCACCGACATCTCGGTCAGCAGAAGAGGGTTTACCTC

ThrLysLeuIleThrTrpGlyAlaAspThrAlaAlaCysGlyAspIleIleAsnGlyLeu
2941 ACCAAGCTCATCACGTGGGGGGCAGATACCGCCGCGTGCGGTGACATCATCAACGGCTTG
TGGTTCGAGTAGTGCACCCCCCGTCTATGGCGGCGCACGCGACTGTAGTAGTTGCCGAAC 

ProValSerAlaArgArgGlyArgGluIleLeuLeuGlyProAlaAspGlyMetValSer
3001 CCTGTTTCCGCCCGCAGGGGCCGGGAGATACTGCTCGGGCCAGCCGATGGAATGGTCTCC
GGACAAAGGCGGGCGTCCCCGGCCCTCTATGACGAGCCCGGTCGGCTACCTTACCAGAGG 

LysGlyTrpArgLeuLeuAlaProIleThrAlaTyrAlaGlnGlnThrArgGlyLeuLeu
3061 AAGGGGTGGAGGTTGCTGGCGCCCATCACGGCGTACGCCCAGCAGACAAGGGGCCTCCTA
TTCCCCACCTCCAACGACCGCGGGTAGTGCCGCATGCGGGTCGTCTGTTCCCCGGAGGAT 

GlyCysIleIleThrSerLeuThrGlyArgAspLysAsnGlnValGluGlyGluValGln
3121 GGGTGCATAATCACCAGCCTAACTGGCCGGGACAAAAACCAAGTGGAGGGTGAGGTCCAG
CCCACGTATTAGTGGTCGGATTGACCGGCCCTGTTTTTGGTTCACCTCCCACTCCAGGTC 

IleValSerThrAlaAlaGlnThrPheLeuAlaThrCysIleAsnGlyValCysTrpThr
3181 ATTGTGTCAACTGCTGCCCAAACCTTCCTGGCAACGTGCATCAATGGGGTGTGCTGGACT

TAACACAGTTGACGACGGGTTTGGAAGGACCGTTGCACGTAGTTACCCCACACGACCTGA 

ValTyrHisGlyAlaGlyThrArgThrIleAlaSerProLysGlyProValIleGlnMet
3241 GTCTACCACGGGGCCGGAACGAGGACCATCGCGTCACCCAAGGGTCCTGTCATCCAGATG
CAGATGGTGCCCCGGCCTTGCTCCTGGTAGCGCAGTGGGTTCCCAGGACAGTAGGTCTAC 

Ser Thr 

TyrThrAsnValAspGlnAspLeuValGlyTrpProAlaProGlnGlySerArgSerLeu 
3301 TATACCAATGTAGACCAAGACCTTGTGGGCTGGCCCGCTCCGCAAGGTAGCCGCTCATTG
ATATGGTTACATCTGGTTCTGGAACACCCGACCGGGCGAGGCGTTCCATCGGCGAGTAAC

ThrProCysThrCysGlySerSerAspLeuTyrLeuValThrArgHisAlaAspValIle
3361 ACACCCTGCACTTGCGGCTCCTCGGACCTTTACCTGGTCACGAGGCACGCCGATGTCATT
TGTGGGACGTGAACGCCGAGGAGCCTGGAAATGGACCAGTGCTCCGTGCGGCTACAGTAA 

ProValArgArgArgGlyAspSerArgGlySerLeuLeuSerProArgProIleSerTyr
3421 CCCGTGCGCCGGCGGGGTGATAGCAGGGGCAGCCTGCTGTCGCCCCGGCCCATTTCCTAC
GGGCACGCGGCCGCCCCACTATCGTCCCCGTCGGACGACAGCGGGGCCGGGTAAAGGATG 

LeuLysGlySerSerGlyGlyProLeuLeuCysProAlaGlyHisAlaValGlyIlePhe
3481 TTGAAAGGCTCCTCGGGGGGTCCGCTGTTGTGCCCCGCGGGGCACGCCGTGGGCATATTT 
AACTTTCCGAGGAGCCCCCCAGGCGACAACACGGGGCGCCCCGTGCGGCACCCGTATAAA 

ArgAlaAlaValCysThrArgGlyValAlaLysAlaValAspPheIleProValGluAsn
3541 AGGGCCGCGGTGTGCACCCGTGGAGTGGCTAAGGCGGTGGACTTTATCCCTGTGGAGAAC 
TCCCGGCGCCACACGTGGGCACCTCACCGATTCCGCCACCTGAAATAGGGACACCTCTTG 
FIG. 17-5

LeuGluThrThrMetArgSerProValPheThrAspAsnSerSerProProValValPro
3601 CTAGAGACAACCATGAGGTCCCCGGTGTTCACGGATAACTCCTCTCCACCAGTAGTGCCC
GATCTCTGTTGGTACTCCAGGGGCCACAAGTGCCTATTGAGGAGAGGTGGTCATCACGGG

GlnSerPheGlnValAlaHisLeuHisAlaProThrGlySerGlyLysSerThrLysVal 
3661 CAGAGCTTCCAGGTGGCTCACCTCCATGCTCCCACAGGCAGCGGCAAAAGCACCAAGGTC 
GTCTCGAAGGTCCACCGAGTGGAGGTACGAGGGTGTCCGTCGCCGTTTTCGTGGTTCCAG 

ProAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsnProSerValAlaAla
3721 CCGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAACCCCTCTGTTGCTGCA
GGCCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTGGGGAGACAACGACGT 

Leu 

ThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAspProAsnIleArgThr
3781 ACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGATCCTAACATCAGGACC

TGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTAGGATTGTAGTCCTGG 

GlyValArgThrIleThrThrGlySerProIleThrTyrSerThrTyrGlyLysPheLeu
3841 GGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACCTACGGCAAGTTCCTT 
CCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGGATGCCGTTCAAGGAA 

AlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCysAspGluCysHisSer
3901 GCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGTGACGAGTGCCACTCC 
CGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACACTGCTCACGGTGAGG

ThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGlnAlaGluThrAlaGly
3961 ACGGATGCCACATCCATCTTGGGCATCGGCACTGTCCTTGACCAAGCAGAGACTGCGGGG
TGCCTACGGTGTAGGTAGAACCCGTAGCCGTGACAGGAACTGGTTCGTCTCTGACGCCCC

AlaArgLeuValValLeuAlaThrAlaThrProProGlySerValThrValProHisPro
4021 GCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTCACTGTGCCCCATCCC 
CGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAGTGACACGGGGTAGGG 

AsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPheTyrGlyLysAlaIle
4081 AACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTTTACGGCAAGGCTATC 
TTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAAATGCCGTTCCGATAG

ProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHisSerLysLysLysCys
4141 CCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCATTCAAAGAAGAAGTGC 
GGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTAAGTTTCTTCTTCACG 

AspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaValAlaTyrTyrArgGly
4201 GACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTGGCCTACTACCGCGGT

CTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCACCGGATGATGGCGCCA 

LeuAspValSerValIleProThrSerGlyAspValValValValAlaThrAspAlaLeu

4261 CTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTGGCAACCGATGCCCTC 
GAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCACCGTTGGCTACGGGAG 

Tyr 

MetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsnThrCysValThrGln
4321 ATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAATACGTGTGTCACCCAG 
TACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTATGCACACAGTGGGTC 

Ser 

ThrValAspPheSerLeuAspProThrPheThrIleGluThrIleThrLeuProGlnAsp
4381 ACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATCACGCTCCCCCAGGAT

TGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAGTGCGAGGGGGTCCTA 

AlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLysProGlyIleTyrArg
4441 GCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAGCCAGGCATCTACAGA 
CGACAGAGGGCGTGAGTTGCAGGCCCGTCCTGACCGTCCCCCTTCGGTCCGTAGATGTCT 

PheValAlaProGlyGluArgProSerGlyMetPheAspSerSerValLeuCysGluCys

4501 TTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCCGTCCTCTGTGAGTGC 
AAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGGCAGGAGACACTCACG 
FIG. 17-6

TyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThrThrValArgLeuArg
4561 TATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACTACAGTTAGGCTACGA
ATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGATGTCAATCCGATGCT

AlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeuGluPheTrpGluGly 
4621 GCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTTGAATTTTGGGAGGGC
CGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAACTTAAAACCCTCCCG 

ValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGlnThrLysGlnSerGly 
4681 GTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAGACAAAGCAGAGTGGG
CAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTCTGTTTCGTCTCACCC 

GluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAlaArgAlaGlnAlaPro
4741 GAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCTAGGGCTCAAGCCCCT 
CTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGATCCCGAGTTCGGGGA 

ProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLysProThrLeuHisGly 
4801 CCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAGCCCACCCTCCATGGG
GGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTCGGGTGGGAGGTACCC 

ProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIleThrLeuThrHisPro 
4861 CCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATCACCCTGACGCACCCA
GGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAGTGGGACTGCGTGGGT 

ValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluValValThrSerThrTrp
4921 GTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTCGTCACGAGCACCTGG
CAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAGCAGTGCTCGTGGACC

ValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeuSerThrGlyCysVal
4981 GTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTGTCAACAGGCTGCGTG

CACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGACAGTTGTCCGACGCAC 

ValIleValGlyArgValValLeuSerGlyLysProAlaIleIleProAspArgGluVal
5041 GTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATACCTGACAGGGAAGTC 
CAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTATGGACTGTCCCTTCAG 

LeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeuProTyrIleGluGln
5101 CTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTACCGTACATCGAGCAA 
GAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAATGGCATGTAGCTCGTT 

GlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeuLeuGlnThrAlaSer
5161 GGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTCCTGCAGACCGCGTCC
CCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAGGACGTCTGGCGCAGG 

ArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGlnLysLeuGluThrPhe
5221 CGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAAAAACTCGAGACCTTC
GCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTTTTTGAGCTCTGGAAG

TrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeuAlaGlyLeuSerThr
5281 TGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTGGCGGGCTTGTCAACG
ACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAACCGCCCGAACAGTTGC

LeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAlaAlaValThrSerPro
5341 CTCCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCTGCTGTCACCAGCCCA
GACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGACGACAGTGGTCGGGT 

LeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrpValAlaAlaGlnLeu
5401 CTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGGGTGGCTGCCCAGCTC 
GATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACCCACCGACGGGTCGAG

AlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAlaGlyAlaAlaIleGly
5461 GCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCTGGCGCCGCCATCGGC
CGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGACCGCGGCGGTAGCCG 

SerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyrGlyAlaGlyValAla
FIG. 17-7

5521 AGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTATGGCGCGGGCGTGGCG 
TCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATACCGCGCCCGCACCGC 

Gly 

GlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSerThrGluAspLeuVal
5581 GGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCCACGGAGGACCTGGTC
CCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGGTGCCTCCTGGACCAG 

AsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGlyValValCysAlaAla
5641 AATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGCGTGGTCTGTGCAGCA 
TTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCGCACCAGACACGTCGT 

IleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrpMetAsnArgLeuIle
5701 ATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGGATGAACCGGCTGATA
TATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACCTACTTGGCCGACTAT 

AlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrValProGluSerAspAla 
5761 GCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTGCCGGAGAGCGATGCA 
CGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCACGGCCTCTCGCTACGT 

HisCys 

AlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGlnLeuLeuArgArgLeu
5821 GCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAGCTCCTGAGGCGACTG 
CGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTCGAGGACTCCGCTGAC 

HisGlnTrpIleSerSerGluCysThrThrProCysSerGlySerTrpLeuArgAspIle 
5881 CACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCCTGGCTAAGGGACATC
GTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGGACCGATTCCCTGTAG

TrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeuLysAlaLysLeuMet
5941 TGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTAAAAGCTAAGCTCATG 
ACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGATTTTCGATTCGAGTAC 

ProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyrLysGlyValTrpArg
6001 CCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTATAAGGGGGTCTGGCGA 
GGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATATTCCCCCAGACCGCT 

Gly 

ValAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIleThrGlyHisValLys
6061 GTGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATCACTGGACATGTCAAA
CACCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAGTGACCTGTACAGTTT

AsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMetTrpSerGlyThrPhe
6121 AACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATGTGGAGTGGGACCTTC 
TTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTCTACACCTCACCCTGGAAG 

ProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAlaProAsnTyrThrPhe
6181 CCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCGCCGAACTACACGTTC
GGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGCGGCTTGATGTGCAAG 

AlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGlnValGlyAspPheHis 
6241 GCGCTATGGAGGGTGTCTGCAGAGGAATATGTGGAGATAAGGCAGGTGGGGGACTTCCAC 
CGCGATACCTCCCACAGACGTCTCCTTATACACCTCTATTCCGTCCACCCCCTGAAGGTG 

TyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGlnValProSerProGlu
6301 TACGTGACGGGTATGACTACTGACAATCTCAAATGCCCGTGCCAGGTCCCATCGCCCGAA 
ATGCACTGCCCATACTGATGACTGTTAGAGTTTACGGGCACGGTCCAGGGTAGCGGGCTT 

PhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaProProCysLysProLeu
6361 TTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCCCCCTGCAAGCCCTTG 
AAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGGGGGACGTTCGGGAAC 

LeuArgGluGluValSerPheArgValGlyLeuHisGluTyrProValGlySerGlnLeu 
6421 CTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCGGTAGGGTCGCAATTA 
GACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGCCATCCCAGCGTTAAT
FIG. 17-8

ProCysGluProGluProAspValAlaValLeuThrSerMetLeuThrAspProSerHis 
6481 CCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTCACTGATCCCTCCCAT 
GGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAGTGACTAGGGAGGGTA 

IleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerProProSerValAlaSer
6541 ATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCCCCCTCTGTGGCCAGC 
TATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGGGGGAGACACCGGTCG 

SerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCysThrAlaAsnHisAsp 
6601 TCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGCACCGCTAACCATGAC
AGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACGTGGCGATTGGTACTG 

SerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGlnGluMetGlyGlyAsn
6661 TCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAGGAGATGGGCGGCAAC 
AGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTCCTCTACCCGCCGTTG 

IleThrArgValGluSerGluAsnLysValValIleLeuAspSerPheAspProLeuVal
6721 ATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCCTTCGATCCGCTTGTG
TAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGGAAGCTAGGCGAACAC 

AlaGluGluAspGluArgGluIleSerValProAlaGluIleLeuArgLysSerArgArg 
6781 GCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTGCGGAAGTCTCGGAGA
CGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGACGCCTTCAGAGCCTCT 

PheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnProProLeuValGluThr 

6841 TTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCCCCGCTAGTGGAGACG 
AAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGGGGCGATCACCTCTGC 

TrpLysLysProAspTyrGluProProValValHisGlyCysProLeuProProProLys 
6901 TGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGTCCGCTTCCACCTCCAAAG
ACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACAGGCGAAGGTGGAGGTTTC 

SerProProValProProProArgLysLysArgThrValValLeuThrGluSerThrLeu
6961 TCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTCACTGAATCAACCCTA
AGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAGTGACTTAGTTGGGAT 

Ser 

SerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSerSerThrSerGlyIle
7021 TCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCCTCAACTTCCGGCATT 
AGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGGAGTTGAAGGCCGTAA 

ThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGlyCysProProAspSer
7081 ACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGCTGCCCCCCCGACTCC 
TGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCGACGGGGGGGCTGAGG 

PheAla 

AspAlaGluSerTyrSerSerMetProProLeuGluGlyGluProGlyAspProAspLeu 
7141 GACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCTGGGGATCCGGATCTT 
CTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGACCCCTAGGCCTAGAA 

SerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGluAspValValCysCys
7201 AGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAGGATGTCGTGTGCTGC
TCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTCCTACAGCACACGACG

SerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAlaAlaGluGluGlnLys 
7261 TCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCCGCGGAAGAACAGAAA
AGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGGCGCCTTCTTGTCTTT

LeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsnLeuValTyrSerThr 
7321 CTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAATTTGGTGTATTCCACC
GACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTAAACCACATAAGGTGG 

ThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAspArgLeuGlnValLeu 
7381 ACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGACAGACTGCAAGTTCTG 
TGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTGTCTGACGTTCAAGAC 
FIG. 17-9

AspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAlaSerLysValLysAla 
7441 GACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCGTCAAAAGTGAAGGCT
CTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGCAGTTTTCACTTCCGA

AsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHisSerAlaLysSerLys 
7501 AACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACACTCAGCCAAATCCAAG 
TTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTGAGTCGGTTTAGGTTC 

PheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAlaValThrHisIleAsn
7561 TTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCCGTAACCCACATCAAC
AAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGGCATTGGGTGTAGTTG 

SerValTrpLysAspLeuLeuGluAspAsnValThrProIleAspThrThrIleMetAla
7621 TCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGACACTACCATCATGGCT
AGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTGTGATGGTAGTACCGA

LysAsnGluValPheCysValGlnProGluLysGlyGlyArgLysProAlaArgLeuIle
7681 AAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAGCCAGCTCGTCTCATC
TTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTCGGTCGAGCAGAGTAG 

ValPheProAspLeuGlyValArgValCysGluLysMetAlaLeuTyrAspValValThr 
7741 GTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTGTACGACGTGGTTACA 
CACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAACATGCTGCACCAATGT 

LysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyrSerProGlyGlnArg 
7801 AAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATACTCACCAGGACAGCGG 
TTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATGAGTGGTCCTGTCGCC 

ValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMetGlyPheSerTyrAsp 
7861 GTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATGGGGTTCTCGTATGAT 
CAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTACCCCAAGAGCATACTA 


ThrArgCysPheAspSerThrValThrGluSerAspIleArgThrGluGluAlaIleTyr
7921 ACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACGGAGGAGGCAATCTAC 

TGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGCCTCCTCCGTTAGATG 

GlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSerLeuThrGluArgLeu
7981 CAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCCCTCACCGAGAGGCTT 
GTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGGGAGTGGCTCTCCGAA

Gly 

TyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGlyTyrArgArgCysArg
8041 TATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGCTATCGCAGGTGCCGC
ATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCGATAGCGTCCACGGCG 

AlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCysTyrIleLysAlaArg
8101 GCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGCTACATCAAGGCCCGG
CGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACGATGTAGTTCCGGGCC 

AlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuValCysGlyAspAspLeu
8161 GCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTGTGTGGCGACGACTTA 
CGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCACACACCGCTGCTGAAT 

ValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSerLeuArgAlaPheThr
8221 GTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGCCTGAGAGCCTTCACG 
CAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCGGACTCTCGGAAGTGC 

GluAlaMetThrArgTyrSerAlaProProGlyAspProProGlnProGluTyrAspLeu
8281 GAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAACCAGAATACGACTTG 
CTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTTGGTCTTATGCTGAAC 

GluLeuIleThrSerCysSerSerAsnValSerValAlaHisAspGlyAlaGlyLysArg
8341 GAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGACGGCGCTGGAAAGAGG
CTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTGCCGCGACCTTTCTCC 
ValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAlaAlaTrpGluThrAla
8401 GTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCTGCGTGGGAGACAGCA
CAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGACGCACCCTCTGTCGT 

ArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPheAlaProThrLeuTrp
8461 AGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTTGCCCCCACACTGTGG
TCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAACGGGGGTGTGACACC 

AlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAlaArgAspGlnLeuGlu
8521 GCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCCAGGGACCAGCTTGAA 
CGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGGTCCCTGGTCGAACTT 

GlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGluProLeuAspLeuPro
8581 CAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAACCACTTGATCTACCT
GTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTTGGTGAACTAGATGGA 


ProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHisSerTyrSerProGly
8641 CCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCACAGTTACTCTCCAGGT
GGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTGTCAATGAGAGGTCCA

GluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValProProLeuArgAlaTrp 
8701 GAAATTAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCGCCCTTGCGAGCTTGG 
CTTTAATTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGCGGGAACGCTCGAACC 

Gly : 

ArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGlyGlyArgAlaAlaIle
8761 AGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGAGGCAGGGCTGCCATA
TCTCTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCTCCGTCCCGACGGTAT 

CysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys 
8821 TGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAAC
ACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTTG


FIG. 17-10 
IMMUNOLOGICAL SCREENING IN BACTERIA


FIG. 18
FIG. 19- N
Some conserved co-linear peptides in HCV & Flaviviruses

                      NS3 region                 NS5
                                                 Highly-conserved
                                                 polymerase region

Flaviviruses          TATPPG    SAAQRRGRIGRNP    GDDCVV
(Yellow Fever,        ******    *  ***** **      *** **
West Nile, Dengue)

HCV                   TATPPG    SRTQRRGRTGRGK    GDDLVV
                      #1348     #1483            #2737


FIG. 21